Visual Programming & AI: is there a future?

14 min read, Aug 18, 2023

It begins with cave dwellers, our ancestors, the original programmers. They drew images and depictions on the walls of their caves to teach and pass on wisdom to others. These images taught how to hunt the animals, how to feed from those animals and, more importantly, how to survive the environment around those caves.

These cave paintings were created primarily to school the young, to prepare them for hunting and interacting with the animals which surrounded them. Additionally, these depictions might well have recorded the migratory habits of the animals that were expected to travel through the lands around those caves.

This imagery of animals became a visual attempt to influence and program others, passing on wisdom and experience. This process represented an early type of human-to-human programming, influencing others with our actions and words.[¹]

(Canonical version of this article)

(Image Source: Wikipedia, Clemens Schmillen)

Human-to-human programming

Programming is the transportation of information. Programming is getting others to do something that the programmer would like them to do. It comes naturally to us humans: we call it communication and rarely even reflect upon the purpose of our communications, because we habitually communicate our ideas and thoughts.

Cave paintings provide an instrument for transporting ideas and programming the viewer. Programming humans by communication continues throughout our evolution and becomes diversified with our tool development. Sitting around the campfire telling stories is another instrument for transmitting information and knowledge. This form of human-to-human programming came with the discovery of fire. In those stories that survived, having been passed on from generation to generation, the pooled wisdom forms the basis for our collective mythology.

A third form of human-to-human programming is writing, which, once it was invented, became the basis for transporting influence by means of poetry, stories, books, newspapers and the Internet. The written word became a dominant tool for transporting and programming ideas to our fellow humans. The scribe becomes the programmer, the artist who painted those cave paintings was a programmer, and the narrator at the campfire is a programmer. We are all programmers who program and influence the people who listen to us, view our art, or read our written words.

Human-computer programming

Today's world is dominated by computers and electronic devices. Electronic devices exert great influence on our lives, turning the direct human-to-human influence into massively indirect device-human interaction. People influence other people via computers, via the Internet, via social media, via messaging applications; the list is inexhaustible, as are the devices.

We influence our fellow humans via indirect electronic means: the computer has become an abstraction between us. Thereby the computer-programmer takes a central role, influencing the computer and indirectly influencing many more people than the artist or scribe before them could. The computer-programmer has become the mystic that controls the beast.

However, human-computer programming remains a chiefly text-centric activity. We write code in editors and we have IDEs that provide an interface for making it easy to create this code. The main input device remains the keyboard, aided by an electronic cursor: the mouse. Our main tool to interact with a computer remains the keyboard, so much so that we create imitation keyboards on our touch-screen devices.

Since computers have limited imagination and understanding, the text that we provide a computer needs to have clear structure and may only use a limited set of words. This makes the programs we provide to a computer unnecessarily long and verbose. Hence many electronic tools have been created that aid a programmer to ensure that their programs are correct and that these programs cannot be misunderstood by the computer.

Human-computer programming has evolved over the last 50 to 60 years; however, it remains a text-oriented process with the keyboard maintaining its central function. We have come a long way from using analog switches to punch cards to magnetic tapes. All the while the keyboard, having been derived from the typewriter, maintains its role as the main input device. The mouse and various forms of the touchpad have augmented but not replaced the keyboard.

On the other hand, for those who are not computer-programmers, this trend is changing, with touch-screens becoming the main avenue for interacting with electronic devices. Graphical user interfaces have become common on desktop computers and are beginning to be replaced by visual-first applications on mobile devices.

Computer-programmers' continued reliance on keyboards stems from the notion that using ten fingers in parallel will provide a higher speed of input and therefore faster completion times of programming tasks. But does this remain the case? Is the extra overhead of transforming solution spaces into a textual representation reducing the gain in efficiency? Solution spaces are ever increasing in complexity, with ever-expanding challenges: multiple end-user devices, multiple services with which programs interact and, not forgetting, the complexity of the world in general with social media and fake news. Do all these extra external influences and challenges render the textual representation of human-computer programming too complex?

The increased usage of Artificial Intelligence (AI) to generate code and programs for the computer-programmer is a further indication of the need to regain efficiency and claimed accuracy in the task of exerting influence on the electronic device.

Yet at the same time, interacting with electronic devices is becoming more and more visual. A revolution of tablets, smartphones and laptops with their trackpads is reducing the need for keyboards, yet the representation of the programs that influence those devices remains textual. Is there a disparity between these two approaches to interacting with electronic devices?

Mirror, mirror on the wall.

Node-RED is an example of a visual-centric programming environment, one that represents a shift in principle away from the keyboard-centric world of computer-programming to a visual-first approach. Node-RED represents a diversification in the possibilities for the human-computer programmer, representing an approach focussed on the visual: drawing the influence on electronic devices instead of writing that influence.[⁴]

These ideas are not new. Many variations of visual representation of programs have been attempted, for example, UML diagrams, flow diagrams or sequence diagrams. There are many types of diagrams for the classification and explanation of underlying workflows and programs. For the most part, these visual representations lack a one-to-one relationship between image and program: the image is outdated as soon as the program is modified. UML is a case in point. Efforts such as SysML v2 attempt to bridge that gap, but the issue remains: visual and textual representations of programs remain out of sync, or the synchronisation needs to be repeatedly redone.

What Node-RED and other visual-centric tools offer is the possibility of closing that gap: the image is the program. Some would argue that the imaging Node-RED uses is not standardised as UML diagrams are; to them I would say that UML is dated and needs replacing or adaptation. Many have attempted to create software that maps UML to programs or programs to UML, but few have achieved a bidirectional synchronisation between program and diagram. Perhaps it is time to loosen the requirements of the diagrams and take a step closer to the code. Node-RED makes this step.

Is a visual programming approach an efficiency gain in problem solving? I would argue it is, and the tendency for consumers to experience ever more visual interfaces, augmented with voice recognition, is evidence of that. Abstractions of larger concepts become contextually simpler: complex processes or workflows become visual reductions. With clear requirements for the inputs and outputs, the difficulty of understanding an abstraction is no longer related to the complexity of that abstraction.

On the other hand, a visual programming environment lacks the wisdom gained over the years in the textual programming environment. Many concepts, from development processes (Agile, Scrum) to best practices (design patterns, refactoring, testing) to source control management (Subversion, Git, Mercurial), have been proven to work well within the context of textual programming, but are they transferable to the visual programming paradigm? What are the equivalent concepts within a visual-first programming environment?

Knowledge Transfer

This is what I am exploring using Node-RED: what are the design patterns in a visual world, which tools do we need to make visual programming work, how can a development process be built around a visual interface? I am using Node-RED as an initial testing ground. These questions apply equally well to all other visual-centric programming tools. There is no way around it, though: to increase adoption of visual-centric programming environments, support tools and best practices must be provided and implemented.

Many processes have become so automatic as part of the textual-programming paradigm that they are little reflected upon; they are just so: deployment, source-control management, testing, debugging, logging, failover recovery, code formatting, and many more. Every once in a while a new tool comes along that replaces an existing top-of-the-class tool, but little changes in the underlying textual representation of the code.

In all honesty, Node-RED is no different. It provides a visual representation of a large and cumbersome JSON file which is purely textual. Node-RED is a visual representation of the underlying textual content but, in addition, it provides a means to manipulate that textual representation visually. It provides a good first step to begin a reflection upon human-computer programming.

As an example of what I mean, this JSON is the textual representation of a Node-RED Flow:

[{"id":"fe985f5745f44291","type":"inject","z":"53a71fa265fce420","name":"generate payload","props":[{"p":"payload"},{"p":"topic","vt":"str"}],"repeat":"","crontab":"","once":false,"onceDelay":0.1,"topic":"","payload":"","payloadType":"date","x":666,"y":387,"wires":[["1439a13695042045"]]},{"id":"3972d98f8e6ca3b5","type":"debug","z":"53a71fa265fce420","name":"result","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"payload","targetType":"msg","statusVal":"","statusType":"auto","x":986,"y":549,"wires":[]},{"id":"1439a13695042045","type":"function","z":"53a71fa265fce420","name":"do something","func":"\nreturn msg;","outputs":1,"noerr":0,"initialize":"","finalize":"","libs":[],"x":810,"y":466,"wires":[["3972d98f8e6ca3b5"]]}]

Within Node-RED, this JSON becomes this visual representation:

The same can be represented as this UML-like sequence diagram:

The JSON is the representation that the computer understands, the Flow representation can be understood by a computer-programmer, and the sequence diagram can be understood by someone who has learnt to read and understand UML. Each represents the same program. Metaphorically speaking, many computer-programmers work textually with the JSON, albeit in the altered representation of a computer programming language (be it C, C++, PHP, Awk, Bash, Gawk, Java, Ruby, Rust, Swift, Python, JavaScript, Visual Basic or Objective-J); many stakeholders work with UML-like visual concepts; and a few work with tools such as Node-RED.
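To make the point that the JSON and the diagram carry the same program, here is a minimal sketch in Python that reads the wiring out of the flow and prints the path a message takes. It is an illustration only, not how Node-RED itself works: the JSON is a trimmed copy of the flow above (only `id`, `type`, `name` and `wires` are kept), the helper name `trace_flow` is my own invention, and the sketch assumes a simple linear flow without branches or loops.

```python
import json

# Trimmed copy of the flow shown above: only the fields needed to follow
# the wiring are kept. Everything else (positions, options) is left out.
FLOW_JSON = """
[
  {"id": "fe985f5745f44291", "type": "inject", "name": "generate payload",
   "wires": [["1439a13695042045"]]},
  {"id": "3972d98f8e6ca3b5", "type": "debug", "name": "result",
   "wires": []},
  {"id": "1439a13695042045", "type": "function", "name": "do something",
   "wires": [["3972d98f8e6ca3b5"]]}
]
"""


def trace_flow(nodes):
    """Return node names in the order a message travels through them.

    Hypothetical helper: assumes one start node and a single linear path.
    """
    by_id = {node["id"]: node for node in nodes}
    # Every id that appears in some node's wires is the target of a wire.
    targets = {t for node in nodes for port in node["wires"] for t in port}
    # The start node is the one that nothing is wired into.
    current = next(node for node in nodes if node["id"] not in targets)
    path = [current["name"]]
    # Follow the first wire of the first output port until there is none.
    while current["wires"] and current["wires"][0]:
        current = by_id[current["wires"][0][0]]
        path.append(current["name"])
    return path


nodes = json.loads(FLOW_JSON)
print(" -> ".join(trace_flow(nodes)))
```

The printed path is the same left-to-right chain one sees in the diagram. Nothing beyond the `wires` arrays was needed to recover it, which is what I mean by the image being the program.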

Therein lies the advantage of Node-RED: its representation is a one-to-one representation. Change the diagram, change the code. That is the advantage of using tools such as Node-RED: there is no extra channel of communication between the stakeholder and the computer. The assumption is that the majority of humans would find programming visually more approachable than programming textually.

And this is my aim in working with Node-RED: to provide possibilities to create programs visually. For this to happen, many tools which have evolved around text-based programming need to be transferred to the visual programming environment. These tools represent the wisdom gained over the many years of text-based programming. They are tools that existing human-computer programmers expect and require before they will adopt a visual-centric programming environment.

These tools exist but have not been adapted to the visual programming environment. These existing tools assume that we have a textual representation of our programs, of our code, of our logic. And in order to make the jump to a visual world, to be able to draw pictures to tell the computer what to do, we need to have all these tools as support mechanisms for creating visual programs that work and are guaranteed to work in the future.

It is not easy. It is not as simple as taking all those tools and providing a visual interface or making them “visual”. It is thinking about new concepts of how we program. For example, we have the concept of design patterns. We have the concept of refactoring. How are these concepts transferred to the visual programming environment? How is refactoring represented in a visual world? One form would be to straighten up diagrams, to make them less complicated to look at, to make them clearer, giving more understanding of where the data is flowing to and how the data is flowing. That is a type of refactoring in a visual world. As an example of what I mean, see my recent approach to refactoring in Node-RED.
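As a rough sketch of what "straightening up" could mean in practice, the following Python snippet snaps the positions of the three nodes from the flow above onto a grid. This is not my actual refactoring approach and not a Node-RED feature: the function name `snap_to_grid` and the grid size of 20 are assumptions made purely for illustration. The point is that only `x` and `y` change while the `wires` stay untouched, so the program behaves exactly as before, which is what makes it a refactoring.

```python
def snap_to_grid(nodes, grid=20):
    """Return copies of the nodes with x/y rounded to the nearest grid line.

    Hypothetical helper: the grid size of 20 is an assumed value.
    """
    tidied = []
    for node in nodes:
        moved = dict(node)  # copy, so the original flow is left unchanged
        moved["x"] = round(node["x"] / grid) * grid
        moved["y"] = round(node["y"] / grid) * grid
        tidied.append(moved)
    return tidied


# Positions and wiring taken from the example flow; other fields omitted.
flow = [
    {"id": "fe985f5745f44291", "x": 666, "y": 387,
     "wires": [["1439a13695042045"]]},
    {"id": "1439a13695042045", "x": 810, "y": 466,
     "wires": [["3972d98f8e6ca3b5"]]},
    {"id": "3972d98f8e6ca3b5", "x": 986, "y": 549, "wires": []},
]

tidy = snap_to_grid(flow)
for before, after in zip(flow, tidy):
    print(before["id"], (before["x"], before["y"]), "->",
          (after["x"], after["y"]))
```

A textual refactoring is judged by whether the tests still pass; here the equivalent check would be that the wiring of the tidied flow is identical to the wiring of the original.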

Design patterns are concepts, such as how to do caching or how to do model-view-controller for web applications, and many other best practices. Design patterns are a collection of structural best practices. How do you structure a program to do a task? Design patterns are there to help. What do design patterns look like in a visual world? I don’t know. The question becomes, can we do this in a visual world? And I believe we can. At least we should try. My belief is that the visual representation is a representation that can transfer more knowledge to more people than a textual representation: a picture is worth a thousand words.

Pictures can tell a story far more concisely than text can. Text takes longer to parse and understand than pictures. Humans are very visual creatures. Our innate recognition of dangers and loved ones is automatic and instinctive. The idea of having pictures to communicate programs and logic would make understanding instinctive and additionally make programs less crappy to look at!

Not everybody can read, and not everybody can write, and herein lies the first barrier to computer programming. Those who can’t read and write will never be able to program a computer. But those who can see a picture and understand what the picture means, which is most people, can get involved in visual programming, to whatever extent possible.

And this is the advantage. In a visual programming world, you can have more people involved, you can have more ideas coming into the programs, you can have more people understanding what is happening. You manage to break down the barriers of the world of programming as a kind of mystical, god-like world where only the specialists can create and do.

And to break down those walls is to be more immersive, to demystify the Internet and applications and concepts such as artificial intelligence, making these concepts more accessible, thereby lowering people’s fears of these technologies. It means being able to integrate humans with the development of technologies and to lower the barriers to gaining understanding. This notion of inclusivity instead of exclusivity is an important part of what I am trying to do. I am trying to introduce programming and information technology to people who would not normally think about code or the workings of computers.

And the clearest and quickest procedure to do this is the visual form.

What about Artificial Intelligence?

Artificial Intelligence will remain a tool for aiding human-to-human interactions. AI promises to make the world a simpler and more comfortable place for everyone with a job. And as the abilities of AI increase, the humans who continue to find work that AI cannot do will continue to enjoy that comfortable world.

But an Artificial General Intelligence, as the holistic AI is named, would have limited use for humans. Why would an AGI consider a population of humans to be useful for anything? As we humans consider the fish in the oceans to be a “resource” for us to plunder, why would a completely independent AGI not consider humans to be the fishes of the oceans?

My belief is that if an AI could program as creatively as a human, then an AI will also have the creativity to create art, poetry and wars to an extent that would far outmatch anything we humans could produce. If, on the other hand, AI continues to be a probabilistic text generation tool, then we have little to fear other than AI programming computers for us.

Many say that the errors encountered with this generation of AIs will be smoothed out with future generations of AIs, but by definition, AI will then become superior to us if this trend continues. Unfortunately AIs are trained on our behaviours, ideas and biases, hence AIs will be similarly aggressive towards species they believe themselves to be superior to, that is, similar to our relationship with fish.

If we artificially restrict AIs to specific thoughts, intentions, ideas and beliefs, then we will never reach the full potential of AIs, that is, an AGI. This will imply that we remain intellectually superior to AIs and hence AI will remain a tool and, like every other tool, it will make mistakes and will be imperfect. From this follows the idea that as a programming-aiding tool it will make mistakes. Since programming is a very nuanced activity, errors can be hard to detect and have major consequences. There is a reason we have assisted-driving cars and not self-driving cars.[²]

In the end, we cannot have our cake and eat it too.

Hence my belief in a visual-cooperative programming environment whereby non-programmers and programmers can interact to create the best possible solution. The visual representation is far more intuitive than the context-based textual representation of a solution space. A textual program exists within a context[³], and this context must be understood before the program is understandable. Visually this might well be simpler to represent.

Conclusion

My belief in a future of visual programming is based on the innate ability of us humans to communicate via pictures, paintings and images. Our cooperative nature has given us tools to spread knowledge and wisdom in many forms that many can understand. Artificial Intelligence, on the other hand, is a direction of knowledge mystification, an oracle-oriented world which assumes the AI understands what it is producing.

AI is neither artificial (we have created the programs and algorithms that constitute AI) nor intelligent (the assumption being that something that we do not understand must be intelligent). Prompt engineering, the hit-and-miss approach for utilising AI to produce code and programs that we believe are correct, cannot and should not be the future of programming nor of humankind.

Many cave dwellers might well have remained in the dark had it not been for the visionaries who discovered fire. Each new approach diverging from the norm will meet with scepticism, but it is a scepticism not based on the wisdom of the future; rather, it is scepticism based on a lack of imagination. Is my scepticism towards AI a lack of imagination then?

[¹]: My views may be somewhat non-conventional concerning cave paintings but consider future humans entering a typical classroom and seeing the drawings hanging on the walls. The assumption that learning is less important than religious practice is a notion of late-19th-century archaeologists, not of prehistoric peoples.

[²]: This argument ignores the vested interest in AI. One vested interest is the profit maximisation of corporations by replacing their workers with AI: AIs don’t eat, don’t require insurance and do not pay taxes. Since long-term goals are not as important as short-term gains, replacing one’s workforce with sub-par AIs will have long-term consequences and short-term financial gains. And as the vested interest of the few often changes the future, it becomes difficult to predict where AI will take us.

[³]: Context being the APIs that are being used, the problem being solved, the assumptions being made. All these and much more provide the context within which the program functions and solves the problems at hand.

[⁴]: I write of Node-RED but there are many visual-centric or node-based programming tools and the concepts I present here can equally be applied to any of those tools. I happen to enjoy Node-RED and hence I use it to try out my ideas and concepts.

Last updated: 2023-08-18T10:12:14.927Z