Last programming language? For now, maybe

In a couple of posts on the Clean Coder blog dating back to 2016, Uncle Bob Martin (one of the founders of the software craftsmanship movement) explores the productivity gains delivered by programming languages. He says that the first programming language (assembler, of course) increased productivity roughly tenfold compared to coding in binary. By now, however, the productivity gains from newer programming languages have dwindled. He makes the case that the development of programming languages has followed a logarithmic curve and that we are approaching an asymptote, where additional effort spent on new languages and frameworks brings smaller and smaller productivity improvements. Uncle Bob claimed:

“No future language will give us 50%, or 20%, or even 10% reduction in workload over current languages”.

I agree with Uncle Bob in the short term, but disagree in the long term.

Programming is the craft of translating human intent into machine code. Improvements in our craft come when we make this translation easier, which is achieved by communicating with computers at an increasingly high level, using ever higher abstractions. In other words, we develop tools that allow the interface between computers and humans to progressively shift from pure machine code toward more human forms of interaction. Programming for ENIAC was done by plugboard wiring. Later, assembly language was invented as an abstraction above the hardware. C is effectively a layer above it. Java is higher yet, as it runs in the JVM and abstracts away memory management. Development tools are also productivity enhancers, as all tools are: content assist makes writing code faster, and so do refactoring wizards.

From this perspective, it’s clear that we have a very long journey ahead of us, with an ultimate goal we can’t even clearly see. Hopefully, at some point, humans will be able to seamlessly augment their brains with computer power and simply think with the aid of their computers. That is the ultimate asymptote. On the way from here to that ultimate simplicity, there are many more milestones and many more programming languages. A programming tool based on a brain-to-computer interface will be far more productive than anything we use today.

Of course, the invention of a direct brain-to-computer interface will not obviate the need for general-purpose programming languages or for programmers. If nothing else, there will always exist devices with highly constrained computing power – what we today call embedded devices. They will exist no matter how much computing power grows, simply because resources will always be scarce, and there will be value in providing some computing capabilities in devices too constrained for full-blown “standard” runtimes. But that doesn’t mean programming for future embedded devices will stay primitive. Even today, embedded programming is not done in machine code, nor in assembly. IDEs for embedded programming could be future-modern, incorporating all the productivity features that are yet to be invented.

We’re not approaching the final asymptote in programming efficiency; we’ve just reached a plateau of sorts. Why the plateau? Because hardware resources have stagnated. Throughout history, practical improvements in programming tools, and the resulting productivity gains, have been linked to the availability of hardware resources. Compilers arrived when computing resources became cheap enough to spend some of them on translating programs. The switch from the long cycle of writing code by hand and running it from punch cards to real-time coding came with abundant CPU power and leaps in I/O technology. IDEs like Eclipse or IntelliJ were impossible with 16KB of memory and a punch card reader.

But wait – there is a computing resource whose availability has recently improved: TPUs (Tensor Processing Units), usually cloud-based. They have enabled the adoption of Machine Learning as a practical discipline. And ML has delivered improvements in the efficiency of software projects on a large scale, by enabling techniques such as image recognition. Some problems that were intractable a decade ago are solved today using machine learning; others were barely possible and required a lot of effort to code. ML is not a general-purpose programming tool, but it is a productive tool for software development.

This brings up another, older pronouncement of Uncle Bob’s. In his famous “The Last Programming Language” speech, Uncle Bob noted that we seem to be going in circles between structured, object-oriented and functional programming, and posited that these three paradigms exhaust all the possible options. But some programming problems can be solved without writing code. Practical Machine Learning has enabled computers to tell cats from dogs and to recognize human faces – something that could not be done reliably in any structured, object-oriented or functional language.

“Trying to manually program a computer to be clever is far more difficult than programming the computer to learn from experiences”,

said Greg Corrado, a senior research scientist at Google AI. True, the context of his quote was somewhat different (he was arguing that Machine Learning is the pathway to Artificial Intelligence), but the point still stands: ML has unlocked automation that was previously impossible. Wouldn’t that qualify it as an alternative to the three classical categories of general-purpose programming languages?

What’s next? I can’t wait to see how coding could change with VR headsets and gesture sensing input devices. They could create new possibilities and unlock completely new paradigms. Happy coding!


Disclaimer: opinions expressed in this post are strictly mine and not of my employer.  


What is Brian Krebs’ True Name?

I’ve recently heard a talk by Brian Krebs, an investigative journalist specializing in cybercrime. The most interesting part of his talk, to me, was the one about his OpSec. It struck me how much it resembles the True Names story.

In Brian Krebs’ case, it is critically important that the bad actors he deals with online can’t track him down and physically attack him. To achieve that, he goes to great lengths to ensure his physical home address is not connected to any of his online accounts. This means not having a title or mortgage in his name, no deliveries in his name to his address, etc. Setting all of this up is not a trivial matter, obviously, but such is the life of a warlock.

True Names is Vernor Vinge’s early-1980s novella, regarded as the founding work of the cyberpunk genre and the first description of cyberspace. In that book, the protagonist runs afoul of government busybodies in cyberspace. He can play all kinds of games online and stay safe as long as his adversaries don’t figure out his real-life identity (his true name). Once they do, they can simply visit his home and voice their demands.

As you can see, the situations are similar, although there is a twist. In Krebs’ case, his real name is known, so his last line of defense moves from the name to location. When your adversaries aren’t your own government, jumping from name to address is not necessarily trivial. For someone concerned about doxxing, there is a good lesson here.  

Half SQL: semi-opaque tables reduce developer effort

Your relational data design could steal a trick from NoSQL: the database representation of your application objects doesn’t need to be broken down into primitive values. Some opacity in relational data is beneficial.

Relational databases impose a rigid structure on data: it is organized into tables, which are made up of columns. Traditionally, columns could only be of primitive types (numeric or character), although opaque types (BLOB, CLOB) and XML are now universally available. The rigid structure makes it easy to reason about the data, to guarantee data quality in the database layer, and to build tooling on top of the database. Querying on the values of individual columns is well defined and highly efficient. Downstream usage (reporting, analysis) becomes easier. Relational structure works well when you need referential integrity across entities that are not updated together.

But relational databases have downsides. One of them is the notorious difficulty of changing the database structure. Martin Fowler wrote: “A lot of effort in application development is tied up in working with relational databases. Although Object/Relational Mapping frameworks have eased the load, the database is still a significant source of developer hours”. Guy Harrison, blogging for TechRepublic: “Change management is a big headache for large production RDBMS. Even minor changes to the data model of an RDBMS have to be carefully managed and may necessitate downtime or reduced service levels”. There is an “impedance mismatch” between the RDBMS and the application layer: “Application developers have been frustrated with the impedance mismatch between the relational data structures and the in-memory data structures of the application”. Even a change confined to a single table (e.g. adding a column) requires significant effort and a synchronized rollout of the database and application layers.

The frustration with the amount of developer effort that relational databases require was one of the drivers behind the rise of NoSQL about a decade ago. NoSQL databases differ from relational ones in many other ways, and switching to NoSQL is not an easy step. Fortunately, you can solve some of your problems with just one item from the NoSQL bag of tricks. You can greatly reduce the impact of single-table changes (such as adding a new column, the most frequent type of change) by making your table definition semi-opaque. Don’t throw away your old and trusty RDBMS. Turn it into a Half SQL data store: in each table, select a small number of key columns that may be used in indexes, and keep those. Hide all other fields from the RDBMS by placing them into an opaque container column of a BLOB type. As a simplified example, an Orders table may look like this:

[Image: order_table – an Orders table with a few indexed key columns and a single opaque BLOB column holding the remaining fields]

Your application will be responsible for serializing and deserializing those BLOBs. Adding a new object field will be invisible to the RDBMS: when you need one, you only change the code in the serializer/deserializer. And if you use a good serialization library (if your application is written in Java, please don’t use the built-in serialization; there are many libraries that are faster and more flexible), in most cases even those changes will be a no-op, because the library handles them automatically. No data migration will be needed. You will be able to write tests verifying that your logic works before and after the change. And you retain the RDBMS goodness of referential integrity and super-fast queries over the indexed columns.
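Here is a minimal sketch of the idea in Python, using SQLite and JSON as the serializer. The table layout and column names are illustrative assumptions, not a prescription:

```python
import json
import sqlite3

# Half SQL sketch: keep only the indexable key columns as real columns,
# stash everything else in an opaque serialized payload.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,   -- indexed, queryable
        payload     BLOB NOT NULL       -- everything else, opaque to the RDBMS
    )
""")
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

def save_order(order: dict) -> None:
    # The application, not the schema, knows the full shape of an order.
    rest = {k: v for k, v in order.items()
            if k not in ("order_id", "customer_id")}
    conn.execute(
        "INSERT INTO orders (order_id, customer_id, payload) VALUES (?, ?, ?)",
        (order["order_id"], order["customer_id"], json.dumps(rest).encode()),
    )

def load_order(order_id: int) -> dict:
    row = conn.execute(
        "SELECT order_id, customer_id, payload FROM orders WHERE order_id = ?",
        (order_id,),
    ).fetchone()
    return {"order_id": row[0], "customer_id": row[1], **json.loads(row[2])}

# Adding a new field ("gift_wrap") later needs no ALTER TABLE, no migration:
save_order({"order_id": 1, "customer_id": 42, "item": "book", "gift_wrap": True})
print(load_order(1)["gift_wrap"])  # True
```

Note that queries by `customer_id` still hit a normal index; only the fields inside the payload are invisible to the database.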

Stashing all “other” object fields into a BLOB column could save you quite a bit of effort.

Machine Learning and Artificial Intelligence

There is a popular line of thinking about Artificial Intelligence (AI) that says Machine Learning is the pathway to AI. This idea gained momentum partly because Machine Learning is the hot topic (and buzzword) of the decade, but also because we understand ML fairly well. With ML in hand, it is easier to make machines that can learn and become smart, while making a machine smart out of the box is a much more difficult task.

This approach makes sense. After all, the only intelligence that we know – human – is learned, not built. In other words, even if we were to invent Asimov-style positronic brains, robots would still need a period of learning after being fully assembled. Of course, robots will need to learn a whole lot faster than humans – nobody wants a robot that takes 18 years to become functional. Fortunately, robots can learn much faster than humans. But to learn fast, robots need a lot of input to process. That means they will have to learn from shared information, as opposed to personal experience alone. To give a hypothetical example, a robot would become a good driver much faster if it could learn not just from its own driving history, as humans do, but from the experience of all the other robot drivers. Robots will need to remain in constant communication with each other (whether centralized or decentralized) to keep up with all the changes in the world around them and get better at whatever they’re doing.

Now, how will ML help us achieve AI? By itself, ML is just machine code capable of syntactic manipulation based on certain statistical rules, which is not intelligence of any kind – not even weak AI. Will it lead a computer system to develop intelligence? The Chinese Room argument says it’s impossible to jump from syntactic rules to semantic understanding, so there is still a big phase transition to be made before AI becomes a reality. Patience, please.

Wer wartet mit Besonnenheit

Der wird belohnt zur rechten Zeit

(Roughly: Whoever waits patiently, will be rewarded at the right time. © Rammstein).

Disclaimer: opinions expressed in this post are strictly my own – not of my employer.

Please NO: Apple patents camera-blocking technology.

Apple patented a technology that would allow blocking the use of iPhone cameras wherever it is deployed. Who is it made for? Certainly not consumers! How many iPhone buyers ever said: “I wish they disabled my camera here”?

In a disingenuous propaganda move, Apple described this technology as being able to “…stop smartphone cameras being used at concerts”. Right. It’s about concerts. It can’t possibly be meant for use by corporate or government security. And I’m sure no police department would think of wiring this device to a squad car’s flashers to prevent bystanders from filming police work. It’s strictly about concerts. Nothing to see and record here, move along.

United’s poor “multi-factor authentication”

United Airlines (united.com) recently “upgraded” their Web site security. They sensibly discontinued 4-digit PIN logins and now require a password of at least 8 characters – standard practice these days. It would’ve been a reasonable change, had they not left a loophole one can fly an airliner through.

As a complement to stronger passwords, united.com now also requires account holders to set up “secret questions”. Leaving aside the question of whether this is a good security measure in general, United’s implementation is recklessly poor. A user can’t enter their own answer – they must select one from a small list of curated items. For the question “What color was the home you grew up in?”, there are 12 choices available. “What is your favorite cold weather activity?” gives you 23 options. Those are low numbers – but it gets worse! When trying to reset a password, a user is presented with 2 questions – and only 10 choices to select from for each question!

[Image: united.com password-reset screen showing a pick list of just 10 answer choices]

So you only need to guess 1 out of 10 twice – and you are in.
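A quick back-of-the-envelope check of how little security that provides. The 36-symbol password alphabet below is an illustrative assumption for comparison, not United’s actual policy:

```python
import math

# United's reset flow: 2 questions, 10 curated answers each.
combinations = 10 * 10
question_bits = math.log2(combinations)

# Compare with an 8-character password over lowercase letters and
# digits (36 symbols) -- itself a fairly weak password policy.
password_bits = 8 * math.log2(36)

print(combinations)              # 100 possible answer pairs
print(round(question_bits, 2))   # 6.64 bits of entropy
print(round(password_bits, 1))   # 41.4 bits for even a weak password
```

About 6.6 bits versus 40+ bits: the “secret questions” gate is tens of billions of times easier to brute-force than the password it supposedly reinforces.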

This is not extra security, this is security theater. But of course in air travel, security theater is the norm (great job, TSA!)

Are True Names in Earthsea digital encryption keys?

I’ve recently reread Earthsea, a classic fantasy novel by Ursula K. Le Guin. The novel is set in a world of magic and tracks the progress of Ged, a mage of unprecedented power, from childhood to adulthood. Keeping in mind the oft-cited quote from another classic writer, Arthur C. Clarke – “Any sufficiently advanced technology is indistinguishable from magic” – I decided to peel back the curtain of magic and imagine what practical inventions might stand behind the magical concepts of the Earthsea world. I’m most interested in the concept of true names. I will show how true names could relate to modern-day digital encryption – with a mage’s true name being his personal digital certificate.

Some of the technology behind the magic in the book is trivially easy to see. For example, when Ged applies a spell to a brackish spring on a tiny sea island, it means he has installed a filtration system. Water desalination would be a plausible alternative, but since it is more complex and bulky (with modern technology), it is less likely.

In many cases, figuring out the exact technical solution is impractical because it is described in magical terms. Any technology would look like magic to people who don’t understand it: they simply have no concepts to explain it and wouldn’t know where to look for clues. So, naturally, people in the medievalist societies where magic fantasies are usually set can’t provide adequate descriptions of the advanced technology they mistake for magic. Hence readers lack the cues to accurately translate the magic into modern-day technical concepts. This is certainly by design.

With this, let’s get to true names, the most creative and crucially important device in this book. Many entities in the Earthsea universe have secret true names, which have special magic meaning. Here’s what we know about true names:

  • Humans, other living beings and geographical objects (seas and islands) have true names.
  • Each human has a unique true name.
  • A true name is permanent and can’t be changed.
  • Animals have one true name for the whole species.
  • Ged’s true name was assigned to him by a mage when he was near puberty; he didn’t have a true name before that. Many other people aged 14 and up also have true names. No person other than the child Ged is specifically described as having no true name.
  • True names can be used by beings with magical powers (human mages and dragons) to control the behavior of the name bearer.

What we don’t know:

  • Do human-made objects (such as houses and boats) have true names? We know that spells are routinely applied to them, but it’s not clear if such spells require knowing the true name of the object.
  • Do all humans get their true names assigned by mages in their teen years? There is not enough information in Earthsea to draw conclusions (and I haven’t read the sequels to Earthsea).

Based on what we know, true names must mean different things for humans, animals and inanimate objects. This would explain the fact that human true names are individual, while animals of the same species share one true name. We also know that there are several kinds of magic, taught by different Master Mages. It stands to reason that if true names unlock different kinds of magic, then the true names themselves may be of different types. Let’s start with animals, which seem to be the easiest case. For animals, the logical true name would be a DNA signature. The magic it enables is, then, biotech-based.

For humans, we could also go with personally targeted biotech. One example of this is found in Vernor Vinge’s book Rainbows End, which is mostly about near-term advanced digital technology and augmented reality. At one point, Alice Gu is incapacitated by a virus targeted at her individually; creating that virus required a sample of her DNA.

There are other ways to individually target humans. The simplest is an identifier issued by a competent authority, something like a Social Security Number in the US. And it doesn’t have to be a number – it could be just a name. After all, only a few years after Earthsea, Ursula K. Le Guin used this idea in The Dispossessed: individual names are unique and assigned on a planet-wide basis.

In this case, determining someone’s true name would be roughly equivalent to modern-day identity theft. We know that identity theft enables the thief to take over the victim’s accounts, disrupting his ability to communicate and his access to finances. It can be really demoralizing, which matches the effect of an adversary knowing your true name (rendering you unable to defend yourself). For this to work, though, the target has to be quite advanced and sophisticated: medieval subsistence farmers won’t suffer much from identity theft, because they don’t bank or use Skype. From the book, it’s not clear whether revealing a true name is equally damaging to different people. We know it has a debilitating effect on mages, but they are a unique caste living by special rules. We have to assume they are the super-sophisticated high-tech elite of Earthsea, so they could indeed suffer from identity theft.

Another possibility is a breach of anonymity. In one of the first sci-fi books of the digital era, True Names, Vernor Vinge shows how a cyberspace character is in grave danger if his real-world identity is exposed. In this reading, the true name is simply one’s real name in the physical world. In True Names, the danger comes from the government, which is able to apply its heavy hand to coerce desired behavior in cyberspace. But “we know where you live” has also been an effective threat from non-government actors, such as the mafia, the KKK and jihadists. In the Internet era, this threat has grown sharper and acquired a new name: doxing.

Finally, here’s the tantalizing possibility: a true name could be a person’s private encryption key. Such a key is necessary to create digital signatures proving the authenticity of communications, and it must be kept secret; otherwise an attacker could impersonate the victim. Again, this only works against high-tech targets dependent on electronic gadgets and communication services. If you are a mage deploying an array of electronic devices, sensors, weapons and so on, a lost private key means that your adversary can now impersonate you in all digital communications. He can issue commands that appear to come from you. He can control all of your devices and turn your weapons against you, which must spell quick and total defeat. The threat is made more severe by the fact that in Earthsea the key (the true name) can’t be changed. We can see the impact of this oversight in the book: once someone’s true name is known, that person (or dragon) remains in danger forever. The ability to change keys is a fundamental tenet of modern cryptosystems: if keys are stolen, changing them (think passwords) allows a quick closing of the security breach. Changing keys is so important that some security experts advise against biometric authentication (e.g. fingerprints or retina scans) precisely because those secrets are unchangeable. Once your fingerprints make it into the wrong hands, you are as helpless as a wizard of Earthsea whose true name has been revealed.
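A toy sketch of this failure mode, with HMAC standing in for a real signature scheme (an illustrative simplification – real systems would use asymmetric signatures, and all names and commands here are made up):

```python
import hashlib
import hmac

# The mage's "true name" acts as a secret signing key.
true_name = b"ged-sparrowhawk"  # hypothetical key material

def sign(key: bytes, command: bytes) -> bytes:
    """Produce an authentication tag for a command."""
    return hmac.new(key, command, hashlib.sha256).digest()

def verify(key: bytes, command: bytes, tag: bytes) -> bool:
    """Devices accept a command only if its tag checks out."""
    return hmac.compare_digest(sign(key, command), tag)

# Normal operation: the mage's own commands verify.
tag = sign(true_name, b"staff: light")
print(verify(true_name, b"staff: light", tag))  # True

# Once an adversary learns the true name, forged commands verify too.
forged_tag = sign(true_name, b"weapons: target owner")
print(verify(true_name, b"weapons: target owner", forged_tag))  # True

# The modern remedy is key rotation -- exactly what Earthsea forbids.
new_key = b"a-freshly-chosen-name"
print(verify(new_key, b"weapons: target owner", forged_tag))  # False
```

Rotating to `new_key` instantly invalidates the attacker’s forgeries; a wizard whose key is fixed for life has no such escape.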

If true names can be understood in terms of digital encryption, then a book on the current FBI vs. Apple encryption fight was written half a century ago.