
Metrics

If you look at Agile, you will see a whole heap of ways to gather and show metrics. These may include, but are not limited to: cumulative flow diagrams, burn down charts, story points, velocity, lead time and cycle time, just to name a few.

I’ve seen these used, and sometimes abused, just because it is what you do in Agile.

I tend to do things a little differently.

For me, metrics are used for a number of things:

  • To look good
  • To help find problems
  • To help plan
  • To see if something made a difference

To look good

These are metrics that I call Vanity Metrics. They make you look good and are generally open to abuse. For example, closing all stories at the end of a sprint, even if they are not complete or have lots of known bugs, so your velocity is nice and high. You then create new stories for the next sprint, or bug tickets, so your velocity remains high.

In my opinion, this is all for show and only strokes someone’s ego.

It is also not the metric itself that makes it a vanity metric, but how it is used. All metrics can be corrupted into vanity metrics and thus be used for gaming the system.

If it is there just to make you look good, then don’t use it any more. That is my opinion.

To help find problems

Metrics used to help find problems are when you measure something to determine where a problem lies. One method of doing this is to put a dot on a card for each day it sits in the same position on the board. If the card hasn’t moved for a few days, weeks or even months, this may indicate a potential problem. It could show there is a bottleneck somewhere. What is the bottleneck? Could someone be overworked? Is there a lack of knowledge? Is there a wait on something outside the team? Has the work been prioritized incorrectly? Was there too little information before the work started?

It doesn’t matter what the cause is; it could be any number of things. The question becomes: how can you prevent it in the future? You may not be able to do anything now, but what measures need to be taken to make the problem go away?

To help plan

By gathering metrics, you can start getting hard data on how long something takes to deliver. It may be only 4 hours of work, but if, because of the way you work, it is not delivered for anywhere between 1 week and 6 months, then why start it now?

If you don’t gather the data, you can’t always see how long things actually take to be delivered.

You can also see how much of what is delivered is actually used. How much is stillborn. How much value a piece of work gives.

How much of your work is planned. How much is unplanned. How much is business driven. How much adds to improvement. How much is innovation.

To see if something made a difference

I do some support work in my role. Part of that support work, every morning when I’m on the on-call roster, is the daily health check. This involves looking through the issues of the last 24 hours and fixing and resubmitting failures. Every day, I record the time it takes me to fix and resolve each problem, then record the aggregated time to get a trend. (Note, I do this myself, I do not “tell” anyone else to do it. Other members of my team may choose to do this as well). I then put this into Excel and get a trend of the time it takes to resolve these issues each day.
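
You don’t need Excel for this if you don’t want it. Here is a minimal sketch of the same idea, assuming the daily totals are exported to a file called healthcheck.csv with one date,minutes line per day (the filename and layout are my own invention for this example):

awk -F, '{ sum += $2; n++; printf "%s  %s min  (average so far: %.1f)\n", $1, $2, sum/n }' healthcheck.csv

Anything that lets you see the numbers trending down (or not) over time will do the job.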

Then, I choose an issue that has caused me the most pain, and work towards making it go away so it never happens again.

The metrics I take are used to determine if fixing the issue made a difference to the overall work. It has. When I first started with the company, the morning health check took a whole day to do. Now it can take up to 15 minutes on a good day. Sometimes it can still take all day, but that is rare. The overall goal of the metric for me is to get it to zero, and keep it there.

Taking these sorts of metrics gives you hard data, even in the short term, to see if something you have done (an experiment, a fix, a deployment of new functionality) actually made a difference.

Another example could be the deployment of a new feature on a web page. Did anyone use it? You won’t know if you don’t record the data.

Conclusion

Don’t get bogged down in the standard metrics that “Agile” uses. They may or may not be useful to you. What is more useful is to record metrics that drive a goal, to drive decisions based on hard data, not on intuition, gut feel or opinion. Sometimes you have to make up a metric based on the effect, as not everything is directly measurable.
Sometimes, having no metrics at all can be useful. Don’t measure everything; just measure a few things until the goal is reached, then move on to something else.

Creating A Self Signed Certificate

I have been developing for the past 20 years, but I have never generated a self signed certificate until recently. Most of my development work is in house integration work. Little to no security has been required. Any applications I have developed have been deployed in house, and again no security has been required. Most tools I write run under Tomcat, but again, no security required. Everything ran under HTTP, not HTTPS. Getting stuff running under HTTPS seemed to be scary – simply because I never looked into it. Well, things change and I decided to give it a go.
In this first part, I’m going to go through how to generate a self signed certificate. These certificates are only really useful for development work, learning – or in house applications. For anything external facing, I recommend getting a proper certificate from an external provider.

Setting up OpenSSL

  • Firstly, you need to install OpenSSL. For Windows, you can find a copy here, or do a Google search. For this tutorial though, I’ll be using Linux on EC2, which already has OpenSSL installed.
  • On Windows, there isn’t an openssl.cnf file; this needs to be created. Thankfully there are places on the internet where you can get one predefined, but here is one I downloaded previously: openssl
  • On Windows again, we need to set the environment variable OPENSSL_CONF to the path to the openssl.cnf file.
    For example :
    SET OPENSSL_CONF=C:\temp\20190417\openssl.cnf

Generate the Certificate

Note: There are a number of methods to generate certificates, but this is the one that I have used.
openssl req -x509 -newkey rsa:4096 -keyout keyname.key -out certname.cer -days 365
Arguments:
req – certificate request and generation utility
-x509 – Outputs a self signed certificate instead of a certificate request.
-newkey <arg> – This creates a new certificate request and private key.
rsa:<nbits> – generates an RSA key of nbits size. In this case 4096 bit key.
-keyout <keyname.key> – specifies the filename for the key.
-out <certname.cer> – specifies the output filename for the certificate. If not specified, then the output is directed to standard output.
-days <numberOfDays> – specifies the number of days that the certificate is valid for. This is used with the -x509 option. If not specified, the default is 30 days.
When you run this, you will be asked a number of questions.
PEM pass phrase : Enter a pass phrase to protect the private key; anything you will remember is fine. In my case, I entered “test”.
Verify PEM pass phrase : re-enter the pass phrase.
Country Name : 2 letter representation of your country.
State or Province Name : Enter as required.
Locality :
And so forth. Enter as best you can. Remember, this is for testing purposes, but for company certificates in real use, I suggest you come up with a convention if one does not already exist.
An example call is:
[ec2-user]$ openssl req -x509 -newkey rsa:4096 -keyout mytestcert.key -out mytestcert.cer -days 365
Generating a 4096 bit RSA private key
...............................................................................................++
..................................++
writing new private key to 'mytestcert.key'
Enter PEM pass phrase:
Verifying - Enter PEM pass phrase:
-----
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [XX]:AU
State or Province Name (full name) []:VICTORIA
Locality Name (eg, city) [Default City]:MELBOURNE
Organization Name (eg, company) [Default Company Ltd]:PERSONAL
Organizational Unit Name (eg, section) []:PERSONAL
Common Name (eg, your name or your server's hostname) []:mytestcert
Email Address []:admin@beanietech.com
[ec2-user@ip-172-31-25-116 certificates]$
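
As a side note, if you ever want to script this and skip the interactive questions, openssl lets you pass the subject on the command line. A sketch using the same details as above (the -nodes flag leaves the private key unencrypted, so only use it for throwaway test keys):

openssl req -x509 -newkey rsa:4096 -keyout mytestcert.key -out mytestcert.cer -days 365 -nodes -subj "/C=AU/ST=VICTORIA/L=MELBOURNE/O=PERSONAL/OU=PERSONAL/CN=mytestcert"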

This will generate 2 files: your certificate (cer file), which holds the public key you can pass on to people you want to exchange secure messages with, and the key file, which you keep to yourself. The key is used to decrypt messages.

Now, there are 2 types of certificate formats. The first is PEM, which is a base64 encoded text file. You can check this by looking at the cer file in a text editor. It might look like this:

-----BEGIN CERTIFICATE-----
MIIF/TCCA+WgAwIBAgIJAMXICEIvh11rMA0GCSqGSIb3DQEBCwUAMIGUMQswCQYD
VQQGEwJBVTERMA8GA1UECAwIVklDVE9SSUExEjAQBgNVBAcMCU1FTEJPVVJORTER
MA8GA1UECgwIUEVSU09OQUwxETAPBgNVBAsMCFBFUlNPTkFMMRMwEQYDVQQDDApt
eXRlc3RjZXJ0MSMwIQYJKoZIhvcNAQkBFhRhZG1pbkBiZWFuaWV0ZWNoLmNvbTAe
Fw0xOTA1MTMxMTM0NTVaFw0yMDA1MTIxMTM0NTVaMIGUMQswCQYDVQQGEwJBVTER
MA8GA1UECAwIVklDVE9SSUExEjAQBgNVBAcMCU1FTEJPVVJORTERMA8GA1UECgwI
UEVSU09OQUwxETAPBgNVBAsMCFBFUlNPTkFMMRMwEQYDVQQDDApteXRlc3RjZXJ0
MSMwIQYJKoZIhvcNAQkBFhRhZG1pbkBiZWFuaWV0ZWNoLmNvbTCCAiIwDQYJKoZI
hvcNAQEBBQADggIPADCCAgoCggIBAJkyuxteepbwlCiNV01YTR8xAx4dwSaEgeqk
n9OiPU6P72pySG4HbqpqIKpp228w4f7quODk5NKVRzOcCCB+1l74a3y1M2dCvwUH
xP6jgcy28qC/3OooasaiWuFPzG5tzr+z/ZpD0xm19CM0v/hMaZ5MH2/cQm4j2gjW
sGiQC1+HVLtIFMaKUgCpQPR1JeEOXpyDrg8Tzs8x/p8eq0WGdTHe9xLpOCqboptA
qbrRP45ThTJ5QhORGQE8XYxmF4xZQZHfe25fa+h+fzj/WYIqo/Sjx52y0657jvGl

The other type is the DER format. This is a binary format for the certificate. If you look at one of these in a text editor, you will see strange characters.

What we have generated here is a PEM format certificate. PEM format certificates generally have the extension .cer, .pem or .crt.

DER certificates generally have the extension .der.
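
If you ever need to convert between the two formats, openssl can do that as well. A quick sketch using our example filenames (the .der and .pem names here are just illustrations):

openssl x509 -in mytestcert.cer -outform DER -out mytestcert.der
openssl x509 -in mytestcert.der -inform DER -outform PEM -out mytestcert.pem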

Viewing the Certificate

To view the certificate, you can just double click it in Windows and it will show you the details of the certificate.

You can also use the following command to view the certificate:

openssl x509 -in certificateName.cer -text <-noout>

The optional -noout (don’t include the angle brackets) stops the encoded certificate itself being printed, leaving just the text details.
With our sample certificate created, we get the following output:

[ec2-user@certificates]$ openssl x509 -in mytestcert.cer -text -noout
Certificate:
Data:
Version: 3 (0x2)
Serial Number:
c5:c8:08:42:2f:87:5d:6b
Signature Algorithm: sha256WithRSAEncryption
Issuer: C=AU, ST=VICTORIA, L=MELBOURNE, O=PERSONAL, OU=PERSONAL, CN=mytestcert/emailAddress=admin@beanietech.com
Validity
Not Before: May 13 11:34:55 2019 GMT
Not After : May 12 11:34:55 2020 GMT
Subject: C=AU, ST=VICTORIA, L=MELBOURNE, O=PERSONAL, OU=PERSONAL, CN=mytestcert/emailAddress=admin@beanietech.com
Subject Public Key Info:
Public Key Algorithm: rsaEncryption
Public-Key: (4096 bit)
Modulus:
00:99:32:bb:1b:5e:7a:96:f0:94:28:8d:57:4d:58:
4d:1f:31:03:1e:1d:c1:26:84:81:ea:a4:9f:d3:a2:
3d:4e:8f:ef:6a:72:48:6e:07:6e:aa:6a:20:aa:69:
db:6f:30:e1:fe:ea:b8:e0:e4:e4:d2:95:47:33:9c:
08:20:7e:d6:5e:f8:6b:7c:b5:33:67:42:bf:05:07:
c4:fe:a3:81:cc:b6:f2:a0:bf:dc:ea:28:6a:c6:a2:
5a:e1:4f:cc:6e:6d:ce:bf:b3:fd:9a:43:d3:19:b5:
f4:23:34:bf:f8:4c:69:9e:4c:1f:6f:dc:42:6e:23:
da:08:d6:b0:68:90:0b:5f:87:54:bb:48:14:c6:8a:
52:00:a9:40:f4:75:25:e1:0e:5e:9c:83:ae:0f:13:
ce:cf:31:fe:9f:1e:ab:45:86:75:31:de:f7:12:e9:
38:2a:9b:a2:9b:40:a9:ba:d1:3f:8e:53:85:32:79:
42:13:91:19:01:3c:5d:8c:66:17:8c:59:41:91:df:
7b:6e:5f:6b:e8:7e:7f:38:ff:59:82:2a:a3:f4:a3:
c7:9d:b2:d3:ae:7b:8e:f1:a5:6a:0f:6b:ec:20:fd:
f6:2c:8c:0f:25:c4:83:60:3b:4f:b2:52:72:04:50:
ea:3b:22:36:ee:53:79:72:e1:04:58:eb:91:79:dd:
bd:07:8d:29:7f:14:12:4c:78:66:de:d5:63:00:98:
52:b7:61:d7:7b:7f:75:55:40:1f:87:61:21:97:78:
9a:2f:e3:2a:fb:2f:0f:a3:50:14:b6:6d:56:7e:39:
27:94:d2:83:40:27:f6:d2:2e:57:0d:fc:94:54:3d:
ca:88:b3:75:b4:2f:97:fd:17:4a:8c:0c:78:66:42:
bf:c3:1a:7a:01:ba:3f:9a:fe:79:06:5d:ab:d9:f1:
da:1e:b3:6d:22:99:bd:db:77:9a:8e:68:51:47:d3:
30:5c:74:29:69:d2:0c:af:3d:2c:27:69:d5:b1:73:
9f:7a:d2:8c:d6:6f:9e:1b:d1:52:26:99:9b:3a:7d:
3e:48:53:4d:43:50:70:fe:74:83:32:34:c9:e4:b5:
32:71:37:7c:d5:39:fb:4e:c5:fd:4e:4a:4d:f2:5e:
50:97:1d:81:9b:f1:0b:3a:b9:ec:d0:b9:b4:e1:1e:
28:b6:50:15:70:17:cb:54:36:15:c6:94:fc:46:c9:
6f:7b:7b:59:ab:4f:f1:48:3c:5d:ef:f7:71:18:28:
87:e1:fb:80:ee:89:a9:13:2a:70:c0:8f:d3:ef:01:
81:d8:27:01:a7:11:a1:52:c3:d6:75:0f:9c:bc:42:
cd:5e:2b:46:77:9e:33:d9:92:f7:14:77:e0:44:92:
77:59:2f
Exponent: 65537 (0x10001)
X509v3 extensions:
X509v3 Subject Key Identifier:
99:62:57:CF:A8:41:83:BF:2E:0E:D8:51:D5:98:13:EA:78:B7:75:8C
X509v3 Authority Key Identifier:
keyid:99:62:57:CF:A8:41:83:BF:2E:0E:D8:51:D5:98:13:EA:78:B7:75:8C

X509v3 Basic Constraints:
CA:TRUE
Signature Algorithm: sha256WithRSAEncryption
0d:e3:53:28:e5:e8:9f:2b:e6:31:66:34:91:c9:79:43:53:e2:
33:db:e6:09:77:5a:4e:5f:91:88:a1:25:a6:f1:d5:bf:59:29:
64:b8:42:ca:ef:ba:2b:a7:b0:12:da:19:08:ff:f3:b3:09:f5:
bc:3f:87:fe:05:e2:25:95:0d:8e:48:72:7c:6f:c5:22:2a:dd:
40:d2:d2:a0:03:10:a1:09:00:c5:c8:2c:61:ff:98:5a:45:0c:
54:b5:80:4d:68:23:b9:57:91:71:0d:6f:8f:bb:92:55:70:d8:
3c:48:fd:b2:bb:a9:26:ba:7c:23:90:7e:f5:27:e7:a6:a1:a6:
02:7d:61:0d:bd:d3:d3:55:04:a2:b9:95:67:ce:cf:38:f2:4d:
45:cf:38:7e:35:e5:bb:3d:39:c4:b6:5d:e2:0c:40:87:87:e9:
c4:b7:83:4e:ca:f6:c6:7b:c6:5d:b4:c3:66:ec:f3:71:d1:d9:
b8:79:45:11:12:97:6e:5c:8f:e0:7c:b3:7c:5c:9e:5a:c8:99:
fe:35:9e:35:d4:55:66:c4:9d:8b:2d:d9:d4:05:85:bf:14:31:
d0:9e:d2:a5:64:46:fd:20:67:8f:ab:2b:83:8f:34:a9:0c:14:
6f:06:be:8b:e7:e4:c7:c0:15:61:e4:37:ef:3e:06:ac:73:61:
bc:b1:73:7b:53:3e:1b:5b:00:16:74:aa:1c:98:78:4c:68:1a:
8b:92:74:6c:f7:dd:f1:52:ec:b6:46:29:56:d3:85:46:f7:fa:
37:cf:3a:f5:c4:61:f0:bd:ed:63:e8:63:70:59:40:c1:72:21:
f2:19:f1:55:e7:df:bb:30:0d:2f:8a:bf:80:ae:b3:9e:b8:5c:
4e:13:98:87:80:61:06:94:0f:e3:9d:f0:c4:9c:a9:1b:9d:34:
47:84:1c:05:bb:cf:f8:7f:6d:8e:b5:3b:44:34:77:29:2f:a7:
ef:4e:46:48:0c:53:b9:ec:bc:1c:ec:39:e4:46:30:19:10:d4:
80:03:92:d2:98:ff:e9:57:f4:8b:18:da:94:7b:f1:55:1e:4c:
e7:3e:ce:c8:94:bf:51:6d:5d:94:26:f4:7b:19:2e:98:4f:c6:
24:32:99:a7:28:51:a0:e4:43:7c:14:ac:63:44:ad:19:a3:18:
9f:97:a0:b9:d7:d3:49:09:7b:b9:fd:c3:cb:ed:dd:9f:f0:f3:
10:14:4a:7a:aa:3e:c7:dd:f6:2f:63:90:2f:f7:b2:07:47:c9:
fb:ab:e9:4c:c0:83:0e:00:69:58:e1:e8:c2:a5:09:5b:fb:3f:
d7:52:49:eb:0e:37:e5:0e:f3:4c:2b:00:c7:11:e3:ba:71:b7:
c2:7d:7d:b0:48:da:01:cc
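
One extra check I find handy (not part of the original steps, just a suggestion) is to confirm that the certificate and the key actually belong together by comparing their moduli. Both commands should print the same hash; the second one will ask for the PEM pass phrase:

openssl x509 -noout -modulus -in mytestcert.cer | openssl md5
openssl rsa -noout -modulus -in mytestcert.key | openssl md5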

Creating the pkcs12 file

The pkcs12 file is an archive format to store the private key with its x509 certificate. You can use the following command to create the pkcs12 file (with the .p12 extension). This file can be used for the keystore in Java applications such as Tomcat.

openssl pkcs12 -export -in certificateName.cer -inkey keyFilename.key -out p12Filename.p12

You will be asked for the key’s pass phrase and an export password. An example would be:

[ec2-user@ certificates]$ openssl pkcs12 -export -in mytestcert.cer -inkey mytestcert.key -out mytestcert.p12
Enter pass phrase for mytestcert.key:
Enter Export Password:
Verifying - Enter Export Password:
[ec2-user@ certificates]$ ls
mytestcert.cer mytestcert.key mytestcert.p12
[ec2-user@ certificates]$

Again, the pass phrase and the export password are “test” for this example.
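
If you want to double check what ended up inside the p12 file, you can dump its contents. A small sketch; the -nokeys flag just stops the private key being printed to the screen, and you will be prompted for the import password:

openssl pkcs12 -info -in mytestcert.p12 -nokeys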

Creating A Keystore for Java (.jks)

Make sure you have Java installed. You can now create a keystore for Java.

Use the following command to create the keystore:

keytool -importkeystore -srckeystore p12Filename.p12 -srcstoretype pkcs12 -destkeystore jksFilename.jks

So for example:

[ec2-user@ certificates]$ keytool -importkeystore -srckeystore mytestcert.p12 -srcstoretype pkcs12 -destkeystore mytestcert.jks
Importing keystore mytestcert.p12 to mytestcert.jks...
Enter destination keystore password:
Re-enter new password:
Enter source keystore password:
Entry for alias 1 successfully imported.
Import command completed: 1 entries successfully imported, 0 entries failed or cancelled

Warning:
The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore mytestcert.jks -destkeystore mytestcert.jks -deststoretype pkcs12".
[ec2-user@ certificates]$

Note, the keystore password needs to be at least 6 characters; our password of “test” will not fit, so I used “test123”.

Now, to view the keystore’s contents, you can use the following command:

keytool -list -v -storepass <Password> -keystore jksFilename.jks

So for example:

[ec2-user@ certificates]$ keytool -list -v -storepass test123 -keystore mytestcert.jks
Keystore type: JKS
Keystore provider: SUN

Your keystore contains 1 entry

Alias name: 1
Creation date: Jul 8, 2019
Entry type: PrivateKeyEntry
Certificate chain length: 1
Certificate[1]:
Owner: EMAILADDRESS=admin@beanietech.com, CN=mytestcert, OU=PERSONAL, O=PERSONAL, L=MELBOURNE, ST=VICTORIA, C=AU
Issuer: EMAILADDRESS=admin@beanietech.com, CN=mytestcert, OU=PERSONAL, O=PERSONAL, L=MELBOURNE, ST=VICTORIA, C=AU
Serial number: c5c808422f875d6b
Valid from: Mon May 13 11:34:55 UTC 2019 until: Tue May 12 11:34:55 UTC 2020
Certificate fingerprints:
MD5: 35:AB:6C:B1:60:E7:AB:1E:09:12:80:CE:0C:E0:85:7B
SHA1: FA:3F:23:FF:3B:B3:1E:8A:1B:84:DB:E6:51:DD:75:0C:C5:62:AD:1E
SHA256: 26:22:D7:1D:83:07:11:A0:21:55:91:53:17:0E:6F:19:8F:34:06:37:83:E0:D3:36:6C:D7:69:C1:67:F9:E5:04
Signature algorithm name: SHA256withRSA
Subject Public Key Algorithm: 4096-bit RSA key
Version: 3

Extensions:

#1: ObjectId: 2.5.29.35 Criticality=false
AuthorityKeyIdentifier [
KeyIdentifier [
0000: 99 62 57 CF A8 41 83 BF 2E 0E D8 51 D5 98 13 EA .bW..A.....Q....
0010: 78 B7 75 8C x.u.
]
]

#2: ObjectId: 2.5.29.19 Criticality=false
BasicConstraints:[
CA:true
PathLen:2147483647
]

#3: ObjectId: 2.5.29.14 Criticality=false
SubjectKeyIdentifier [
KeyIdentifier [
0000: 99 62 57 CF A8 41 83 BF 2E 0E D8 51 D5 98 13 EA .bW..A.....Q....
0010: 78 B7 75 8C x.u.
]
]

*******************************************
*******************************************

Warning:
The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore mytestcert.jks -destkeystore mytestcert.jks -deststoretype pkcs12".

Securing Tomcat

Now that we have our certificate and our keystore (for testing purposes), we can add the certificate to Apache Tomcat.

Place your jks file under your <tomcat home>/conf/SSL folder. This isn’t the standard directory, it’s just the directory that I like to use to place keystores and truststores in.

Then in the <tomcat home>/conf/server.xml file, find the following entry:

<Connector port="8080" protocol="HTTP/1.1"
connectionTimeout="20000"
redirectPort="8443" />

After this entry, add the following:

  <!-- Added manually after the tomcat installation for SSL configuration -->
<Connector port="443" protocol="org.apache.coyote.http11.Http11NioProtocol"
clientAuth="false"
keyAlias="<Key Alias>"
sslProtocol="TLS"
keystoreFile="conf/SSL/keystoreFilename.jks"
keystorePass="<password>"
truststoreFile="conf/SSL/truststoreFilename.jks"
truststorePass="<password>"
maxThreads="150"
SSLEngine="on"
SSLEnabled="true"
SSLVerifyDepth="2"
/>

Where keyAlias is the alias of the certificate in the keystore. If you look above in the keystore contents, we see the line

Alias name: 1

This is where the value was taken from.
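
As an aside, if you would rather use a friendlier alias than “1”, keytool can rename it. A sketch against our test keystore (depending on how the import went, it may also prompt for the key password):

keytool -changealias -alias 1 -destalias mytestcert -keystore mytestcert.jks -storepass test123

You would then set keyAlias="mytestcert" in the connector instead.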

We haven’t created a truststore here, but the same concepts are used to store the truststore certificates.

My entry is the following:

<!-- Added manually after the tomcat installation for SSL configuration -->
<Connector port="443" protocol="org.apache.coyote.http11.Http11NioProtocol"
clientAuth="false"
keyAlias="1"
sslProtocol="TLS"
keystoreFile="conf/SSL/mytestcert.jks"
keystorePass="test123"
maxThreads="150"
SSLEngine="on"
SSLEnabled="true"
SSLVerifyDepth="2"
/>

Now, if you start up Tomcat, you should be able to access it via https.

For example, if Tomcat was installed on your local system, you may go to: https://localhost

Now, your browser will warn you that the site is not safe. This is because you are using a self signed certificate. Just accept the warnings and allow access. You should then be able to access Tomcat over https.
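
If you prefer to check it from the command line, curl works too. The -k flag tells curl to accept our self signed certificate (only do this for testing), and -v shows the handshake details:

curl -kv https://localhost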

 

What a waste

When you have so much work to be done, and it is all done on the fly with no

  • Standards and Manual Work, the priority of tasks keeps changing. Do this now, no, stop that and do this. You are constantly
  • Task switching. This leads to
  • Partially done work. Nothing really gets completed on time. Everything is late. The customer is
  • Waiting for their solution longer. Since it takes so long to get to the customer, and we are under the pump, we make mistakes and cut corners, and this introduces
  • Defects that need to be fixed. The buggy code goes back to development, then testing, then UAT again; it is constantly in
  • Motion. The software is so buggy that we introduce gated changes, procedures and other
  • Extra Processes to prevent issues. The deadline looms. Nothing is complete, everyone stays back late into the night and works weekends. Their
  • Heroics are astounding. We finally ship. We planned everything up front. Did everything the customer initially specified. Oh no! They didn’t really need those
  • Extra Features that we spent months building. They had no business value, they were just nice to have.

My Rules for Being Agile

First rule of being agile. You will fail! Yep, all those that say Agile doesn’t work are correct. It doesn’t work. Let’s move on.

Second rule of being agile. Find out the thing that prevented you from succeeding.

Third rule of being agile. Try to fix that thing that you found in the second rule.

Fourth rule. Go back to the first rule.

Rule zero. Do this regularly and consistently and eventually the first rule becomes irrelevant because it becomes habit.

This is simple, but it isn’t easy. You will lose motivation. I know I have at times. You will get tired. You will at times get stuck in a rut. It’s easy just keeping things at a steady state. You sometimes just have to pull yourself out and keep going. Keep finding the problems and solving them. When you are stuck in a rut, try to stretch outside your comfort zone or do something different. It might inspire you or give you a new perspective.

Can Waterfall Be Agile?

I was sitting in a drive-through waiting in the queue, listening to an Agile podcast, and my mind wandered, which isn’t good.

My thoughts led me to thinking – can Waterfall be Agile? The answer I came up with was yes and no.

Why Yes?

Heresy you say, Waterfall is not Agile! Why do I think that Waterfall could be agile? Well, Agile is a mindset. It is thinking that things can be better. It is always looking for ways to improve the current situation. Improvement in the ways we work, reducing the waste that prevents work from being done sooner, reducing the amount of work done, inspecting what you are doing and adapting to try to make things better in both flow and quality. It is taking care of the people doing the work. Making sure that they have enough information to do the work. Making sure they have the skills and/or ability to do the work. Looking for and searching out problems. If there are no problems, then stressing the system to see what breaks and then fixing the issues that arise. It is looking at the system as a whole. This mindset can be applied just as well to Waterfall as it can to Agile.

The biggest problem with Waterfall that I see is that this type of thinking isn’t applied. I have been in Waterfall projects where the process is followed blindly. People grumble about the issues, but nothing is done about them. Or, at the end of the project, a meeting is called to look at ways to improve next time. The suggestions are gathered, documented and filed away, and are not brought up again for the next project. So the same mistakes are made all over again. This behaviour is very common; it is probably one of the main reasons why people hate Waterfall. You are stuck in a bad situation and nothing changes to get out of it.
When a company that has this culture takes on Agile, the same behaviours can still remain, which may be a reason why so many developers also hate Agile.

Why No?

Now, assuming that you do have the right mindset, you do look at your issues, you do try to solve them, and you still do Waterfall, this is where I start to think that Waterfall cannot be Agile. It’s quite obvious: the feedback cycle between learning of a requirement and then learning whether that requirement actually solves a problem or gives value is quite long. It can be months or years before something developed is actually used in anger. The ability to learn and change takes much longer. It is just the nature of the Waterfall process. It does not mean, though, that you cannot add learning.

Now, my thinking is that if you apply the Agile mindset to a Waterfall process using something like Lean or Kanban, and by Kanban I don’t mean just putting cards up on the wall but actually trying to improve the flow of those cards, you will eventually become agile. Not Agile with a capital “A”, but agile in the sense of being nimble, being able to change with minimal fallout such as unfinished work, wasted money, late night calls, death marches and so forth. That is, if you have that goal in mind. Who knows, you may end up developing your own Agile methodology.

Human Error

I have recently finished reading Sidney Dekker’s “The Field Guide To Understanding Human Error” and I found it very interesting.
The book is generally about safety, but I believe the lessons learned could be applied to incident management with IT systems.

He first talks about the “Bad Apples”; this is the view that human error exists. One of the causes is people doing the wrong thing, not trying hard enough, or not paying enough attention and missing some significant detail. Sidney calls this the “old view”, which I see as not surprising. We, as humans, tend to see our creations as perfect. This especially applies to the processes we develop. We develop these complex steps that need to be followed, and when someone makes a mistake, it is their fault for not following the process. It is their fault for not trying hard enough. It is their fault for missing that crucial bit of information that could have prevented the incident.
So when you have this mindset, what is the solution? Reprimand or fire the person who made the mistake! Give the person more training? Increase adherence to the process or make the processes stricter? Add more contingencies and paths to handle any situation? Or, especially in IT (and I’m very guilty of this one), add more technology?
These steps, according to Dekker, do not work. They stop learning in its tracks. We blame the person rather than look at the system as being “imperfect” and trying to fix that. So why do we blame people? Well, simply because it is easy and quick. You did something wrong, therefore it is “your fault”. It saves face: “The process I developed is perfect! You just didn’t follow it correctly”. Basically, it “feels right” to blame someone.
The problem with blaming “someone” is that it stops learning. The investigation stops, and you’re done. Whereas if you continue to investigate, you will see what the underlying problems were that caused the incident in the first place.
In this view, which Dekker dubs the “new view”, human error is a symptom of some underlying problem. Something more systemic. The incident that occurred is but the start of the investigation, not the end.
First things first, you need to assume that when people come to work, they come to do a good job. If you have someone coming to work to cause havoc, then this is something different, but even then you can still investigate: why did that person start causing trouble? What drove them to this? This can be especially important for a long term employee: what drove them to do this? For a short term employee, you may ask: what did we miss with our hiring procedures?

Hindsight bias

One reason that the old view is popular is because of hindsight. You know what the effect of the actions was, because the incident already happened. We know that if you do “chown -R root /*”, you change all files to have root as the owner (OK, I admit that the command could be wrong, I’m not willing to try to verify), completely screwing up the system (yes, I have done that early in my career). No, it does not save time when trying to change permissions on several directories when you are in a hurry. (At least that is what I think I thought 20 years ago).
The thing is, that when the person is taking the action at the time, they don’t know what will happen. If they did – they would not have done the action that caused the incident in the first place.
What’s more, you don’t know what was going through the person’s mind at the time they were performing the action. Could they have been concentrating on something else that they deemed important at the time? Could their priorities have been elsewhere?
For example, hypothetically, a pilot brings a plane through a storm and crashes (no one is injured, but there is lots of damage to the aircraft). Should they have flown through the storm? Obviously not, now that we know the consequences. But let’s say that they were already several hours delayed. Their priority was to get the passengers back on time. Had they diverted around the storm, they would have been late and reprimanded. Had the pilots gone through the storm and nothing happened, they would have been heroes to the passengers. At the time that the decision needed to be made, without knowing the consequences, what would you have done?
That is one of the topics in the book – if you were in the same situation, under the same conditions as the people in the incident, could you have done the same thing?
If not – why not? Why are you different?
If on the other hand you could have done the same thing – what would have prevented you from doing it?
Another example Dekker goes through is a certain type of aircraft that had a large number of crashes during WWII. Pilots were pulling the lever for the flaps instead of the landing gear. The two levers were near each other. They tried everything. Reprimanding pilots, re-training pilots, but the crashes continued. It wasn’t until an engineer looked at the problem in a different way that things changed. What the engineer did was glue a little flap shape onto the lever for the flaps, and a little wheel onto the lever for the landing gear. You see, the pilots would find the levers by touch, as their concentration was focused on something else during landing. Adding a tactile indicator for which lever was which pretty much eliminated the crash landings.

Recording Errors

How many of you record incidents? Do you put them into Jira, or some other bug tracker? Do you do anything with the incidents you record? Or do you just fix the immediate issues and move on, only for the incident to occur again and again?
So much so that it becomes part of your work?
Recording incidents, minor ones or even large ones, does not fix the problem. Also, given that we work in a complex environment – computer software development – it is hard (but not impossible) to make something error free, especially given the demands of the job. But even if something is error free, there can be circumstances that cause issues. Hardware failures, even scenarios that were never originally thought of. Users do tend to find ways to break things.
In this, Dekker has another analogy. During WWII, planes were being shot up quite a bit. Some planes made it back after a run. Some didn’t. The question was asked, should we add armor to the areas that had holes to prevent the holes in the first place? Armor adds weight and should be used sparingly.

The answer was “No”, you add armor to the areas that didn’t have holes – because those were the planes that made it back. The ones that didn’t make it back most likely had holes where they shouldn’t have been.
So where should you focus your efforts in fixing problems? In the areas that cause you the most pain! Fix those, remove the pain. This should free you up for more valuable work.

Technology

Sometimes – and I’m a big sucker for this one – we think that replacing a person with technology will prevent the issue. The problem here is that it may fix the immediate issue, but what about the boundary conditions that were never thought of? These could turn a minor issue that a human could resolve into a catastrophic one. Dekker doesn’t say don’t automate, but be careful what you automate. Make sure it augments the person rather than replaces the person.
Technology is good for repetitive problems, but isn’t so good for changing conditions. Only a human can handle those.

Conclusion

Look at the error that someone made as the start of an investigation. Look at what caused the issue in the first place. What state of mind was the person in? What were their incentives at the time? What can be done to alleviate those issues and prevent the problem in the first place? Amazon does this. While browsing Hacker News, I came across an article that goes through a talk by Jeff Bezos on how Amazon looks at incidents that occur.
The thing is that to prevent incidents, you have to rely on the people you have. Make sure they have the knowledge, experience and support to handle incidents. Make sure that they are the ones who figure out how to permanently solve the issue. They are the experts on the issue, because it is part of their job. If someone makes a mistake, don’t chastise them; congratulate them for finding a flaw in the process, then look at ways to patch or repair that flaw. This may start a culture where mistakes are not hidden, but made out in the open. When they are out in the open, they can be fixed – permanently. This can only make your organization better. This, in my opinion, is the crux of Agile. Finding the problems in the system and fixing them to make you faster, make you better, make you more knowledgeable and make you feel safe enough that you can expose more problems in the system.

 

Edit: Found the article that I referred to and added the link. I found the article on Hacker News, but it refers to a thread on Quora.

A Practical Example Of Continuous Improvement

This is my example of continuous improvement. At least a start. It is far from perfect, but I would like to think it is a step in the right direction.

I currently do on call as part of my job. Every 4 weeks, we spend a week taking the support phone, doing the daily health checks and fixing up any issues that come up. When I do these health checks, most of the time we are fixing symptoms of problems – not the actual root cause of the problems themselves. There just isn’t enough time to investigate the root cause and fix everything. I think most people are in the same situation in an enterprise environment. You have legacy systems that have lots of problems, and sometimes it feels overwhelming to try to fix everything. During this support period, I try to pick one thing, one problem, one pain point and try to make it go away.

For example, we had another part of the business regularly request data from us (I work on the integration/middleware side) for messages that were sent outside the business (B2B). That part of the business would then modify those messages to correct data and have us resend them. I got sick of this, so I wrote an app that allowed the business to do it themselves. It took about 2 weeks to do the first part, allowing them to send the messages when they were ready, then another week to allow them to retrieve the data. All of that while doing my normal job as well. I just “stole” an hour here and there to work on this.
The UI is ugly as, but it allowed that area of the business to rapidly turn around issues rather than wait 2 days for us to get around to getting their data. It also lifted a burden from the team.

It’s quite simple: if you have a manual task that you do on a regular basis, script it so it runs automatically. If you have an error that occurs regularly, find its root cause and fix it. You don’t have to do everything all at once. Schedule an hour or two every couple of days – pretend you are going to a meeting if you cannot make your work visible (because it is frowned upon). It’s not ideal, but sometimes you have to “hide” work in a non-understanding environment. The best way though is to make the work visible – you never know if another team member is doing the same thing.
We don’t do sprints, so our support schedule becomes my unofficial cycle time. The preference is to do this more frequently. If you do sprints, pick something – anything – per sprint. If you don’t do sprints, start off by doing something once per month (that is, per person), then increase to once per fortnight, then once per week, then always have something going on in the background that you work on at least once per day, even for just 15 minutes. If a piece of work that would take you a few hours actually takes a few days because you have to balance other work, so be it. Think small – you don’t have to spend months, just do what you can do now. Find a small solution to make it just a little better. Look for ways to build small capabilities, then link those capabilities to make giant leaps. Modularise and componentise.
Look at standardising processes. Start by doing things manually to work out the steps, then automate. Sometimes standardisation just means improving your documentation so that other people in the team can read it. Sometimes it is making sure that everyone shares knowledge.

Slowly, over time, things will get better – especially if everyone in the team/company does it.

This is part of Scrum.
This is part of Lean.

This is in my opinion one of the foundations of an Agile Mindset. Learning to see where problems are, learning ways to expose problems and then learning to fix those problems.  Not looking at ways to avoid or hide problems. Hiding and avoiding problems just means you are building Technical Debt that will bite you some time in the future. Most likely at the most inconvenient time.

Also, always keep quality at the forefront. Even if quality dips – and it will – don’t just accept it. Look at ways to improve it. Look at ways to keep quality but speed things up through removing unnecessary work, or automation.

There are cycles in the way we work and live, even if you don’t do Agile. Agile just accelerates those cycles trying to make them smaller and smaller. Use those cycles for learning. Use those cycles for improving.

Work Vs Value

Not all work generates value.
For example, “keeping the lights on” work helps maintain value. It does not generate it. In fact, by maintaining the status quo, you may actually be reducing value as the rest of the world moves on. What you do may actually come to mean less. It may mean quite a lot to the business at the time, but if your competitors are doing things better, cheaper and faster, then in the grand scheme of things it isn’t much – thus a reduction in value.

You also have work that reduces value. Have you ever worked on a project that was canned, or a feature that is never used? There can be the argument that you are creating “value” as you are producing an output, but if no one uses the feature or the project is canned before anything produced is utilized, you do not get any revenue. Yes I know that value doesn’t necessarily mean revenue, but revenue can be a good measure. If no revenue is generated, the work has a cost, and the output of that cost is lost. The value of that work then is negative.

Sometimes the value can be hard to find. Training your staff on a new technology versus sacking the lot of them and bringing in new people: the latter is one way to get a skill quickly for the company, but at what cost? The knowledge of the people. You lose whatever knowledge the workers who were sacked had. Tacit knowledge isn’t easily transferred or learned, even if handed over to replacements. You may also gain a reputation as a company that treats workers like throwaway objects, making it harder to find people in the long term, at least not without offering significant incentives such as a very high salary, and even then you may find it hard. There is also increased turnaround time, as new workers continually have to learn aspects outside their initial core knowledge. Every organisation does things differently. Core knowledge of a language or technology doesn’t mean that they will be able to hit the ground running straight away.

Value is a finicky thing. You know it when you see it, but if you have never seen it, it’s hard to know what it is.

In order for your work to generate value, you have to be constantly vigilant. You have to be constantly finding ways to reduce the work that doesn’t bring benefits, the time and money sucking work. Not only that, but you have to do it in a constantly changing environment. Work that generated value at one point in time, may become a burden the next. That work may be necessary, but you need to constantly find ways to reduce or eliminate that work. With the time left over, you need to find ways to increase value, which in turn generates more work. It is a never ending cycle.

So, how do you know if your work is generating or sucking value? IMO that is through measuring the outcome. The sooner you release a piece of work, the sooner you can determine if it has produced value. Then you can decide whether to continue enhancing that piece of work, or to throw it away and lose the cost of doing that work; since the output of the work was measured at an early stage, that loss is minimal.
For the “keeping the lights on” work, what happens if it is stopped? Does it stop the business from functioning? Does it reduce the value of the output? If not, keep it going, but look at ways to keep the same “value” at a reduced cost in both time and money. Notice I said value, not output. For example, automating a workaround to a bug that was previously done manually keeps the same work output, it’s just done quicker. Fixing the root cause of the bug changes the output of the work by eliminating the work, but the value (keeping the system running) remains the same in both cases.

Work has a life cycle; sometimes that cycle can be minutes, sometimes it can be years. To determine what stage the work is at in its life cycle, you need to look at its value. You need to constantly evaluate the work, constantly check it. Can it be reduced, changed or eliminated to give the same or more value?

Without Fear…

I don’t know where this came from. Whether I wrote it or took it from somewhere – a Google search got me nothing – but I found it in my notes and I thought it was quite profound.

With fear, there is no honesty.

Without honesty, there is no trust.

Without trust, there is no openness.

Without openness, there is no respect.

Without respect, there is no commitment.

Without commitment, there is no responsibility.

Without responsibility, there are no results.

If you know the source, if there is one, please let me know, otherwise I’ll claim credit :-).

Deviant Behaviour

Recently in Australia, we have just had the results of a Royal Commission into the banking sector, and they are damning.

For those that don’t know, a Royal Commission is a significant investigation carried out by the government. This one found abhorrent behaviour by the banks in order to make a buck. The thing is, this was across the board. All major banks in Australia were involved.

Although no one went to gaol (jail, for the Americans), at least one CEO had to resign.

So, what happened? Well, to put it simply, the bad behaviour became normalized. It got so bad that the government had to step in.

I’ve just finished reading a book by Sidney Dekker, The Field Guide to Understanding Human Error, and I’ve listened to a couple of talks where he mentions the Normalisation of Deviance.

This is where there may be a slight deviation from normal behaviour. Deviation is normal; we have to adapt to changing conditions. The problem comes when that deviation is towards something that is not right. This is what happened with the banking sector. A long time ago, being a banker was a respectable job. Being a bank manager was a position of prestige and respect. Then something changed. My thoughts are that this happened in the 80s, but that is my own opinion. That change was to make significantly more money for the bank – or for one’s own gain – through not so ethical means. That unethical act, though small, was accepted. It could have been a sign of the times, who knows. But it was accepted.

Other people seeing the benefits decided to do the same thing. At that point, that act was now “normal” behaviour.

Since that act is OK, why not another act, then another, then another. Slowly, over time, behaviour that was previously unacceptable became acceptable, even encouraged, till – POW! A line is crossed and things are really, really bad.

This analogy of normalisation of deviance also works for the acceptance of errors.

Let’s say, as happens in a number of projects, no Sev 1 or Sev 2 errors are allowed, but we allow 3 Sev 3 errors in order to get the project into production as quickly as possible. Then the next project comes along. Well, things went well with 3 Sev 3 errors last time, so why not 6? Then the next one after that, 12. These cause problems for the operations team, but they can fix the problems on a daily basis (the symptoms, not the underlying errors). Project after project adds more issues. It sounds a bit silly, but this happens everywhere that errors are accepted. Not immediately, but slowly over time. Years or decades. Sometimes the problems get fixed; sometimes you live with them until they become more trouble than they are worth, then you have a little pow, where you create a project to fix some of these problems. Maybe through a re-write. Maybe a replacement of technology. If things are really bad, then you have a major outage with significant down time. On the other hand, you may never see these issues – or you think you may never see any issues.

The thing is – when you are in the bubble, where the behaviour is acceptable, you do not see the POW come until it is too late. That is, if it does come. Why? Because the precursors for the problem are part of normal work. Normal, acceptable behaviour that everyone in the bubble does.

So what is the solution? In my opinion, it is to have a goal of what is acceptable behaviour, what is an acceptable condition. A condition or behaviour that is so excellent, so noble, that it is unattainable, and if it is attained, then make it higher. Accept that deviation will occur, but don’t normalize it unless it brings you closer to that higher plane. Systems degrade because things change; entropy enters the system because of those changes, and we as humans have to change with it. What that higher goal gives us is a direction. A direction beyond our own self interest. A goal to strive for. That is how I see we can veer away from trouble.

With the banks, the goal was to make money. Excellence was secondary, if thought of at all. Yes, they were successful; some even got away with it. They made money, but at what price? Reputation? Possible gaol time? Respect? Was it worth it? It seemed like it was, right up until the end.
They never saw it coming.

Agile is a tool that can help keep you heading towards a goal – if done properly. With an ideal goal at hand, Agile helps because you pause every sprint, see how you are going towards your goal state, then adjust towards that goal. Sometimes you veer away, but if you look, you can see it and make an adjustment. On the other hand, if you do Agile without a goal state in mind, then compromises are made. Those compromises then become normal, and then more compromises are made, again and again, and then no amount of following Agile practices will help you. Things may seem good at first, they may even seem good for a while, but the question is – are they really? How do you know when you are in the bubble? Even if you have a goal, how do you know it is the right goal?

The thing is you don’t – you have to be continuously vigilant. You have to keep striving to be better – to do what is right.