Reversibility and Low Impact Change are the non-functional requirements for Agile Technology Change

Reversibility

I have attended and chaired many Change Review Boards for large organisations, covering the introduction of new systems and changes to existing systems. I have also overseen many development programmes, taken difficult decisions and helped stakeholders make decisions. The two factors I have seen that most affect the speed of a change are its “reversibility” and its “impact”.

The reversibility of a change is the ease (time, risk and cost) with which it can be reversed, along with any side effects. The impact of a change is how it will affect the organisation, the customers, the shareholders, the public and the environment. This impact could be financial, reputational or legal, and in some cases it can be real, perceived or imagined.

For the most part, software code is the easiest thing to change back (or roll back); data is harder, and infrastructure harder still. Front-end software is generally easier to roll back than back-end software because of the dependencies involved.
Changes which make permanent changes to databases, however, require extensive testing to ensure the change will perform as expected, in addition to backups to restore the original state, which may incur cost and lost time. The most damaging changes are those which affect upstream and downstream systems that cannot be restored to a previous state; here parallel running is advisable, which is the most expensive of the solutions.

When deciding how to authorise a change I used the reversibility/impact matrix to help judge the ease with which the change could be achieved. An easily reversible, low-impact change is easily dispatched if the assessment looks reasonable. A high-impact but easily reversible change requires more planning and techniques in the implementation (Blue-Green Deployment, for one) to make sure it creeps out rather than floods out. A low-impact but hard-to-reverse change needs robust challenge over the level of impact the change will actually incur. A high-impact, hard-to-reverse change needs the most preparation in terms of regression testing, swing environments or even parallel running.
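To make the matrix concrete, here is a minimal sketch of it as a lookup table. The category names and recommended approaches are my own illustrative assumptions, not a prescriptive change-management standard.

```python
# Sketch of the reversibility/impact matrix as a simple lookup.
# The categories and recommended approaches are illustrative assumptions.

APPROACH = {
    # (reversibility, impact): recommended handling
    ("easy", "low"):  "Approve on a reasonable assessment; standard release.",
    ("easy", "high"): "Plan a gradual rollout, e.g. blue/green deployment with quick rollback.",
    ("hard", "low"):  "Challenge whether the impact really is low; take backups before release.",
    ("hard", "high"): "Full regression testing, swing environments or parallel running.",
}

def recommend(reversibility: str, impact: str) -> str:
    """Return the suggested change approach for a proposed change."""
    return APPROACH[(reversibility, impact)]

if __name__ == "__main__":
    print(recommend("easy", "high"))
```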

Therefore low-impact and highly reversible change is the perfect combination for agile delivery, sometimes termed ‘fail fast’ (which I usually amend to ‘fail fast and safely’ to reassure stakeholders).

Cloud technologies, containerisation, orchestration and schema-less databases are so popular amongst architects and developers because they create environments that are easy to create, easy to test and easy to change, all with the minimum of impact.

GDPR and the Streetlight effect

There is a lot of heated debate about the GDPR (General Data Protection Regulation) – is it a Y2K damp squib or PPI/Libor/Forex all over again?

The General Data Protection Regulation is certainly much more than Information Security and Privacy-by-Design, which are just table stakes for the future.

The Regulation covers:

  • the right to be informed of personally identifiable information gathered and processed
  • the right of access to personally identifiable information
  • the right to rectification of mistakes
  • the right to erasure of personally identifiable information
  • the right to restrict processing personally identifiable information
  • the right to data portability to a new supplier
  • the right to object to marketing, scientific and historical research
  • the right to not be subject to automated decision making and profiling.

I agree that the maximum fine will probably never be levied, but the range is sufficient to put it at the same impact level as the fines banks have suffered in the last decade. The fine will be based on the ability to pay and the perceived damage caused by a breach. There is also the threat of private civil action by individuals who can claim damages for breaches of the regulation.

The challenge for technologists is to provide compliant, holistic solutions that meet the very inexact wording of the regulation around what constitutes reasonable measures. These requirements will themselves be driven by newly formed governance bodies, internal and external compliance, legal counsel, marketers, external suppliers and data processors. A technology programme with a multitude of stakeholders and vague requirements is seldom a success.

The personally identifiable information itself is usually scattered among a number of front-office, back-office and supplier systems, often without good links to relate the information. There is also the risk that the data resides in the Shadow IT of the organisation, away from the control of the technology department.

There are also technical challenges in the solutions proposed. For example, the regulation points to encrypting data as an appropriate measure, but as any infosec expert will attest, encryption is only as good as the algorithm and where you store the encryption keys. As for the right to erasure, that will be the topic of a future post on the technical challenges of achieving that requirement.

I applaud vendors such as Microsoft certifying this week that their solutions can support making an organisation GDPR compliant, but that shouldn’t take the focus off the overall organisational and technological change required.

At the moment, it seems we suffer from the streetlight effect, only looking for the problems that are perceived to be the easiest to fix.

GDPR – Federated Search Technology Pattern

The EU General Data Protection Regulation requires that European citizens’ personally identifiable information is controlled, secured, available on request and able to be erased by 25 May 2018. Additionally, customers have the right to easily change their marketing contact preferences within a reasonable time. For most organisations this is a challenge when they are dealing with legacy systems with departmental controls, and the only data warehouse is provisioned for analytical purposes only.

One way of achieving compliance is to create an enterprise-wide Data Lake containing all the organisation’s customer information, sourced from different business units’ operational systems and core systems such as email, intranet and shared drives. The Data Lake also contains all the original systems’ metadata (data about the data) plus provenance information such as back-links to the systems of record in the individual business units. These back-links to the operational systems allow the right to erasure to be exercised if requested by a customer. The metadata contains the creation and last-accessed dates from the system of record along with security information, so that the correct access controls can be applied to the Data Lake.
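As an illustration only, a minimal sketch of what a single Data Lake entry might carry alongside the customer data; the field names, identifiers and dates are assumptions of mine rather than any particular product’s schema.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List

@dataclass
class LakeRecord:
    """Illustrative Data Lake entry: the customer data plus the metadata
    and provenance needed to exercise rights such as erasure."""
    customer_id: str                 # lake-wide identifier after matching/linking
    payload: dict                    # the customer data as landed from the source
    source_system: str               # system of record in the business unit
    source_record_uri: str           # back-link used to exercise erasure at source
    created_at: datetime             # creation date carried over from the source
    last_accessed_at: datetime       # last-accessed date from the source
    access_groups: List[str] = field(default_factory=list)  # security metadata

# Hypothetical example record
record = LakeRecord(
    customer_id="CUST-000123",
    payload={"name": "A Customer", "email": "a.customer@example.com"},
    source_system="crm-eu-west",
    source_record_uri="crm://eu-west/accounts/991",
    created_at=datetime(2016, 3, 1),
    last_accessed_at=datetime(2017, 11, 20),
    access_groups=["marketing-eu", "dpo"],
)
```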

Ensuring a single view of the customer is much easier if all the enterprise data is within the Data Lake, including data from third-party SaaS (Software-as-a-Service) providers such as Salesforce or Microsoft Dynamics. The addition of Master Data Management technologies from Informatica or IBM can provide data cleansing either before the data reaches the Data Lake or during a search on the Data Lake. The Data Lake model also allows enterprises to resolve the issues around marketing contact preferences, which can be different in each customer relationship management system or account. Allowing the customer to change their contact status or contact channels becomes easier if they can be found in the Data Lake.

However, creating Data Lakes can be very challenging if the security or data models are heavily embedded within the operational systems, or if local jurisdictional systems have to be used for access control and monitoring.

The alternative model is Federated Search, which for some organisations is a better solution. Federated Search also allows the minimal amount of sharing, as it uses a ‘merge on query’ approach to inter-departmental data, which allows potentially greater compliance with ‘privacy-by-design’ constraints on the systems. Additionally, a Federated Search can cross the organisational boundary into external data processors’ systems in real time.

The Federated Search model requires each departmental operational system to provide a full-text search index, either re-using an existing index technology or deploying a bespoke search capability using, for instance, Apache Solr or ElasticSearch. An Enterprise Search Service provides a central service or portal from which queries can be made. The Enterprise Search Service cascades any customer lookup queries down to departmental federated query engines, which then search for the data in their local index. If customer data is found, a specific query is constructed on the operational system. The returned data is correlated, matched and linked to provide a single view of the customer’s data. The actual search uses the security credentials of the user requesting the information from the Enterprise Search Service, so the security controls and logging are preserved. In addition, the business unit data owner retains real-time control over access to the data and can see the data access patterns within their existing context. Another benefit is that the local index protects the operational system from unexpected load or logging, as the resulting queries from the federated query engines can be optimised for extracting specific information rather than searching.
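To show the ‘merge on query’ cascade in miniature, here is a sketch of an Enterprise Search Service fanning a lookup out to departmental endpoints and merging the hits. The endpoint URLs, response shape and the naive email-based correlation are all assumptions for illustration, not a reference implementation.

```python
# Sketch of the Enterprise Search Service cascading a customer lookup to
# departmental federated query engines and merging the results.
import requests

# Hypothetical departmental search endpoints exposed by each business unit
DEPARTMENTAL_ENDPOINTS = [
    "https://search.retail.example.com/customers",
    "https://search.lending.example.com/customers",
    "https://search.marketing.example.com/customers",
]

def federated_customer_search(query: str, user_token: str) -> list:
    """Cascade a customer lookup to each business unit and merge the hits."""
    merged = {}
    for endpoint in DEPARTMENTAL_ENDPOINTS:
        resp = requests.get(
            endpoint,
            params={"q": query},
            # The caller's credentials are passed through so that local access
            # controls and logging are preserved in each business unit.
            headers={"Authorization": f"Bearer {user_token}"},
            timeout=10,
        )
        resp.raise_for_status()
        for hit in resp.json().get("results", []):
            # Naive correlation on email address; a real deployment would use
            # proper matching and linking (e.g. an MDM service).
            key = hit.get("email", "").lower()
            merged.setdefault(key, []).append(hit)
    return [{"match_key": key, "records": records} for key, records in merged.items()]
```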

From a delivery perspective, the Federated Search option means the organisation is not running a big programme at the centre of the organisation, with the attendant issues of communication, governance, funding and additional dependencies on an already stretched enterprise. The individual business units have the freedom to define the indexing technologies and subsequent queries, and only need to comply with a well-defined API for data queries and for security authentication and authorisation information. The system owner for Active Directory (or the equivalent identity and access management service) is not required to consolidate permissions from various systems. The Security Operations Centre does not need to take on feeds from a new system and try to correlate them with the existing operational systems to determine access patterns.

The central technology programme is therefore responsible for defining the Customer Search API, the Federation Services and the API definitions, alongside an on-boarding plan which can match the pace of the overall organisation and of the individual business units.
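For illustration, the Customer Search API contract each business unit on-boards against might look something like the following sketch; the class, method and field names are hypothetical and simply reflect the capabilities described above.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List

@dataclass
class CustomerHit:
    """One record returned by a business unit's federated query engine."""
    source_system: str
    source_record_uri: str     # back-link into the system of record
    display_name: str
    email: str

class CustomerSearchAPI(ABC):
    """Illustrative contract a business unit implements when on-boarding
    to the Federation Services."""

    @abstractmethod
    def search(self, query: str, user_token: str) -> List[CustomerHit]:
        """Search the local index using the caller's own credentials."""

    @abstractmethod
    def update_contact_preferences(self, source_record_uri: str,
                                   preferences: dict, user_token: str) -> bool:
        """Apply a customer's changed marketing contact preferences locally."""
```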

Data Lakes are a powerful technology for organisations to deploy; however, with the impending deadline for GDPR compliance (25 May 2018) looming, some organisations may need to take a more expedient approach.

For more information about GDPR, please see the UK Information Commissioner’s Office website: https://ico.org.uk/for-organisations/data-protection-reform/guidance-what-to-expect-and-when/

I encourage you to watch the video and provide feedback in the comments for suggestions, improvements, alternative approaches and critique.

General Data Protection Regulation – Right to be informed – Technology Strategy

The Right to be Informed (about processing) under the General Data Protection Regulation relates to information gathered directly from the individual, through their interactions with the organisation, or via third parties.

The technology challenge with such a right is making sure that the correct individual is identified and informed about the processing. Even the process of informing the individual is fraught if that individual has already notified the organisation that they do not want to be contacted. From a legacy-systems viewpoint, good search, matching and linking is required to ensure the right individual is contacted and that they have given prior consent to communication.

Organisations that have implemented a CRM (customer relationship management) system (such as Salesforce, Dynamics CRM or Siebel) to provide a ‘single view of the customer’ will be able to use Master Data Management technologies (such as IBM InfoSphere or Informatica PowerCenter) to provide matching and linking of information related to individuals, enriching the CRM with consent information. The consent information should be evidentially recorded via a secure timestamp and cryptographic hash, with access provided to those systems that need to enforce compliance (via a RESTful interface using the secure hash as a key).
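A minimal sketch of that idea follows: a consent record is timestamped, hashed, and the hash becomes the key other systems use to look it up. The field names are assumptions, and a production system would want a trusted timestamping service rather than the local clock used here.

```python
# Sketch of evidentially recording a consent decision: a timestamped record
# whose SHA-256 hash can serve as the key for a RESTful compliance lookup.
import hashlib
import json
from datetime import datetime, timezone

def record_consent(customer_id: str, purpose: str, granted: bool) -> dict:
    """Build a consent record and derive the hash used as its lookup key."""
    record = {
        "customer_id": customer_id,
        "purpose": purpose,           # e.g. "email-marketing"
        "granted": granted,
        # A real deployment might use a trusted timestamping authority here.
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }
    canonical = json.dumps(record, sort_keys=True).encode("utf-8")
    record["evidence_hash"] = hashlib.sha256(canonical).hexdigest()
    return record

consent = record_consent("CUST-000123", "email-marketing", granted=True)
# Other systems could then fetch the record via a hypothetical endpoint such as
# GET /consent/{evidence_hash} to check compliance before processing.
```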

Tactics for compliance include scanning and processing paper forms into an indexed, evidential document store which provides a (RESTful) interface for supplying compliance information. When new information is obtained about the individual, existing processing and data consent will need to be revalidated and the individual informed of the change. Typically I have seen ElasticSearch used for search indexing, with an Enterprise Content Management system (Alfresco Digital Business Platform or IBM FileNet) storing the documents, and the results stored in the CRM system synchronised with an evidential status store based on NoSQL or SQL technology.
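As a sketch of the indexing step, the snippet below pushes an OCR’d paper form into ElasticSearch with a link back to the evidential copy in the content management system. It assumes the elasticsearch-py 7.x client style; the index name, fields and ECM URI are illustrative rather than any product’s schema.

```python
# Sketch of indexing an OCR'd consent form so it can be found later.
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])

def index_scanned_form(document_id: str, ocr_text: str, ecm_uri: str) -> None:
    """Make a scanned paper form searchable, keeping a link back to the
    evidential copy held in the content management system."""
    es.index(
        index="consent-forms",
        id=document_id,
        body={
            "ocr_text": ocr_text,        # full text extracted from the scan
            "ecm_uri": ecm_uri,          # e.g. the Alfresco/FileNet object reference
            "status": "pending-review",  # later synchronised with the CRM
        },
    )

index_scanned_form(
    "FORM-2017-0042",
    "I consent to receive marketing email...",
    "ecm://store/consent/FORM-2017-0042",
)
```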

The right also conveys the need to inform the individual within a reasonable time and in a manner acceptable to the individual: easily understood and convenient. This could be post, e-mail, SMS or push notification depending on the individual’s needs, especially to comply with accessibility requirements. These channels all need gateway technologies such as print, email or SMS gateways (Twilio, for example) to ensure the individual can be contacted.
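By way of example, the SMS channel via a gateway such as Twilio can be as small as the sketch below; the credentials and phone numbers are placeholders, and the message wording is mine.

```python
# Sketch of notifying an individual over SMS using Twilio as the gateway.
from twilio.rest import Client

def notify_by_sms(to_number: str, text: str) -> str:
    """Send a short processing notification by SMS and return the message SID."""
    client = Client("ACCOUNT_SID", "AUTH_TOKEN")  # placeholders, not real credentials
    message = client.messages.create(
        body=text,
        from_="+15005550006",   # placeholder sending number
        to=to_number,
    )
    return message.sid

# notify_by_sms("+447700900123", "We have updated how we process your data: ...")
```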

From an open-source viewpoint, ElasticSearch (for search), Apache NiFi (for data flows and linking), Apache HDFS (for document storage) and the database of your choice (MySQL, Postgres, MongoDB or HBase are on my shortlist) cover the core building blocks. Adding an API Gateway (Mule, Knox or Kong) provides other systems with the consent information, which can trigger a notification to the user when new information is recorded or processing changes (using event stream processing with Apache Storm, or via AWS Lambda/Azure Functions if cloud-based).
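To make the event-driven trigger concrete, here is a sketch of a handler (deployable as an AWS Lambda or Azure Function) reacting to a “processing changed” event and queuing a notification to the individual. The event shape and the stand-in notifier are assumptions for illustration.

```python
# Sketch of an event-driven handler that reacts to a processing-change event
# and informs the individual over their preferred channel.
import json

def notify(customer_id: str, channel: str, message: str) -> None:
    """Placeholder for the gateway call (print stands in for post/email/SMS)."""
    print(f"[{channel}] {customer_id}: {message}")

def handler(event: dict, context=None) -> dict:
    """Evaluate a change event and inform the individual if required."""
    change = json.loads(event["body"]) if "body" in event else event
    if change.get("requires_notification", True):
        notify(
            customer_id=change["customer_id"],
            channel=change.get("preferred_channel", "email"),
            message=f"Your data is now used for: {change['purpose']}",
        )
    return {"statusCode": 200, "body": "processed"}

# Example invocation (locally, or wired to an event stream in production):
handler({"customer_id": "CUST-000123", "purpose": "credit scoring", "preferred_channel": "sms"})
```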

Overall, the strategy to support this right points towards an event-driven architecture as the strategic way forward, so that new information being collected or obtained triggers an evaluation of the change, which in turn triggers notifications to the individual. Whether those are spontaneous single actions or batched into meaningful, yet still timely, notifications, the event-driven approach still stands.

 

General Data Protection Regulation – Technology Strategy

The GDPR (General Data Protection Regulation) is part of the evolving legal landscape as digital transformation changes society’s interaction with organisations, and it comes into force in the UK on 25 May 2018. Regardless of Brexit, if a UK organisation wishes to process data about individuals from Europe, the regulation needs to be followed.

The regulation sets the policy around privacy-by-design to complement security-by-design, both of which feed into an organisation’s technology strategy.

In summary the GDPR adds:

  1. The right to be informed (of the use of data)
  2. The right of access
  3. The right to rectification (of errors)
  4. The right to erasure (to be forgotten)
  5. The right to restrict processing
  6. The right to data portability (to move to a different supplier)
  7. The right to object (to processing)
  8. Rights in relation to automated decision making and profiling.

Like the original UK Data Protection Act, the GDPR aims to make sure that information is used appropriately, accurately and not excessively by any organisation. The change is not only the provision of new rights but a new enforcement regime with much stiffer penalties, so the trade-off of investment vs. penalty is skewed towards more investment. For some organisations the penalty is an existential threat to the company, and definitely something the board will care about and the shareholders will take notice of in the annual report.

My experiences as a customer of organisations that were non-compliant under the Data Protection Act remind me of the need for such protections. A financial institution tried to derive my financial profile from my banking records, not recognising the fact that I have multiple bank accounts for different purposes and that the individual accounts gave a poor picture of my overall financial needs. Similarly, a credit card company assumed I was a bad risk due to an error in my employment status and so raised my interest rate, only to plead with me to stay when I corrected the error, cleared the balance and cancelled the card.

As a former data protection controller I understand the pressure of balancing the needs of the business against the regulatory framework and so I offer some technology strategy solutions to meet these needs.

From a technology strategy perspective there are a number of technologies and design patterns that provide solutions to each provision under the GDPR. Organisations with legacy infrastructure will be challenged to comply unless they take remedial action through tactical fixes followed by strategic change.

In this series I will cover the eight rights and offer tactics and strategies to comply with each in the form of technologies and design patterns.

For more please see the UK Information Commissioner’s Office guidance: https://ico.org.uk/for-organisations/data-protection-reform/overview-of-the-gdpr/

Technology Security Lamination

The key theme for protecting IT systems from unauthorised access is to offer multiple layers of protection in terms of people, technology and physical environment. This is known as defence-in-depth when referring to technology, separation-of-concerns when referring to people and compartmentalisation when referring to the physical environment. Ultimately all these techniques resemble lamination as applied to bulletproof glass or car windscreens, where multiple layers of different materials, each with different physical characteristics of strength, hardness and brittleness, are stronger in composite form.

Defence in depth is the use of different technologies to offer layers of protection, each protecting some aspect of the system such as vulnerable protocols, incorrect content, hard-to-resolve bugs or the risk of a single person compromising the entire system. Such technologies include firewalls (web, layer-7 or network), DMZs (network zones between two firewalls), content-inspecting proxies (anti-malware and data loss prevention) and Virtual Private Networks. They work with IDAM (Identity and Access Management) solutions, which also provide authorisation, authentication, auditing and logging.

Separation of concerns for people involves dividing roles between developers, administrators (including DevOps) and security operations staff. The developers design and write code, which is in turn deployed and managed by administrators, and both roles are monitored by security operations staff. The logging and monitoring provided by the IDAM solutions needs de-duplication, correlation and analysis for behavioural changes to detect intrusion and infiltration by new techniques (zero-day threats) which emerge, or are deliberately engineered, to overcome the defence-in-depth solutions. The people aspects of security need a strong security policy, good background checks on candidates, regular training against social engineering and a culture of continuous improvement.

Physical security is still a very important aspect of the overall security defence regime, and technologies such as strong encryption only mitigate the risks rather than ending them. Strongrooms, multiple doors, biometric security, multiple physical sites, CCTV, intrusion detection alarms and in some cases TEMPEST/SCIF techniques are the foundation of good information management. Managing and monitoring physical security in the same way as digital security is paramount to achieving the right level of control. Cloud environments enable some of this responsibility for storage, compute, network and backup to be shifted to third parties, who are constantly improving through achieving ISO 27001, PCI-DSS, HIPAA and FedRAMP certification, which benefits all their customers.

Technology security lamination, using different technologies, techniques and people at each layer, provides the best approach to meeting the challenge of continual improvement in the arms race that is cyber-security.