When Big Data Gets Too Big

Feature Story

Computer modelling has long been used as a design tool, and big data is a massive resource on which to build these models. But when the datasets get too big for computers to handle, innovative design solutions are needed. And the urge to play has driven technology forward.

It’s a common science fiction trope, popularised by The Matrix films, with roots that go back to Descartes’ epistemology and Plato’s metaphysics: what if the world as we know it is an illusion, specifically a virtual reality simulation running on a computer? But beyond the fantasy, computer simulations - from models of systems within the human body to models of the flow of traffic through a city - have many practical applications, from healthcare to infrastructure design. And beyond these practical applications lies the dream of creating a simulated brain that can match, and eventually exceed, the abilities of the human mind - with all the unforeseeable advances that would follow. For now, however, the virtual playground of computer games seems to be the biggest testing ground for new design ideas.

The advent of big data would appear to be a boon to designers of simulated, virtual worlds. Living in a massively interconnected world, each of us produces gigabytes of data every day whether we intend to or not. Without even realising it, everyone living in an advanced economy generates data through the automatic recording of their transport use, shopping habits, telecommunications, internet use… in fact, almost every aspect of daily life generates data. All of this big data is available to be fed into computer models, making them ever more accurate, precise and detailed.

Can designers of virtual worlds handle all of this data? Ideally, every data point could be fed into a simulation to create an almost perfect simulacrum of the real world. It is tempting to think that all that is needed is more computing power to process it all. But there is a problem.

Hitting a wall

There’s a limit on what is possible. Landauer’s principle, first proposed by Rolf Landauer in 1961 while he was working for IBM, is derived from the fundamental laws of nature: the conservation of energy and the increase of entropy. The principle shows that there is a fundamental limit to what can be computed for each joule of energy a computer consumes. So, what does all this rather abstract-sounding theory have to do with big data and virtual reality? In short, it demonstrates that a computer running a simulation of a single functioning human brain, with existing technology, would have a power consumption equivalent to that of a medium-sized country. For example, in 2007 it took IBM’s Blue Gene supercomputer to simulate ‘half a mouse brain’, and even this massively powerful machine was only able to run the simulation at half-speed.
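To give a sense of the scale involved, Landauer’s limit is usually written as k_B × T × ln 2 joules for every bit of information erased. The short Python sketch below evaluates it at an assumed room temperature of 300 K. The function names are invented for this illustration, and the result says nothing about how close any real machine comes to the limit.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant (exact SI value)


def landauer_limit_joules(temperature_k: float) -> float:
    """Minimum energy, in joules, needed to erase one bit at this temperature."""
    if temperature_k <= 0:
        raise ValueError("temperature must be positive (kelvin)")
    return BOLTZMANN_J_PER_K * temperature_k * math.log(2)


def max_bit_erasures_per_joule(temperature_k: float) -> float:
    """Upper bound on irreversible bit operations per joule at this temperature."""
    return 1.0 / landauer_limit_joules(temperature_k)


room_temperature_k = 300.0  # assumption: the computer runs at about room temperature
print(f"{landauer_limit_joules(room_temperature_k):.3e} J per bit erased")
print(f"{max_bit_erasures_per_joule(room_temperature_k):.3e} bit erasures per joule")
```

The limit scales with temperature, which is why the operating temperature has to be stated as an assumption.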

However, computers are getting better all the time. Moore’s Law is the widely known rule of thumb which states that, as technology advances, the computing power that a dollar will buy you doubles every two years. There is an equivalent law, Koomey’s Law, named after Stanford professor Jonathan Koomey, which describes a similar rate of increase in computational efficiency, that is, in the number of computations performed per joule of energy. However, in around the year 2048, Koomey’s Law will hit the impassable barrier of Landauer’s principle, and further progress will be impossible.
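The reasoning behind a date like this is simple exponential arithmetic: count the doublings that separate present-day efficiency from the limit, then multiply by the doubling period. The sketch below shows only that arithmetic. The function name and the example inputs are hypothetical and are not the figures behind the 2048 estimate.

```python
import math


def years_until_limit(current_efficiency: float,
                      limit_efficiency: float,
                      doubling_period_years: float) -> float:
    """Years of steady doubling before efficiency reaches a fixed ceiling.

    Both efficiencies must use the same unit (e.g. operations per joule).
    """
    if min(current_efficiency, limit_efficiency, doubling_period_years) <= 0:
        raise ValueError("all arguments must be positive")
    if current_efficiency >= limit_efficiency:
        return 0.0  # already at (or beyond) the ceiling
    doublings_needed = math.log2(limit_efficiency / current_efficiency)
    return doublings_needed * doubling_period_years


# Purely illustrative numbers: a ceiling 1,024 times higher than today's
# efficiency is ten doublings away, whatever the unit.
print(years_until_limit(1.0, 1024.0, doubling_period_years=2.0))
```

Because the growth is exponential, even a very large gap between today's machines and the limit closes in a modest number of doublings.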

Even if quantum computing becomes a reality, there is a yet more fundamental limit to computational efficiency: the Margolus-Levitin theorem, which caps what it is possible to compute at about 6 × 10^33 operations per second per joule.
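For readers who want to check the number, the figure quoted above follows from the usual statement of the Margolus-Levitin bound: a system with average energy E can perform at most 4E/h elementary operations per second, where h is Planck’s constant. Dividing through by E gives the rate per joule:

```latex
\[
  \nu_{\max} = \frac{4E}{h}
  \quad\Longrightarrow\quad
  \frac{\nu_{\max}}{E} = \frac{4}{h}
  = \frac{4}{6.626 \times 10^{-34}\ \mathrm{J\,s}}
  \approx 6 \times 10^{33}\ \text{operations per second per joule}
\]
```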

So when a brute-force approach is shown to be inadequate, maybe a design solution can help.

Good design beats brute force

SIGNED spoke to the artist Lawrence Lek. His work, which was on display at this year’s Venice Biennale, involves creating complex virtual worlds and simulated environments. He explained that “There’s a difference between images that are rendered in real-time, as with computer games, and those that are pre-processed, as with each frame of a Pixar film or a Hollywood special effects sequence. These can take hundreds of hours of computing time, made using networked render farms.”

To produce images in real time requires the design of clever systems, Lek explains: “In more passive media like films, there’s not so much need to optimise the performance like there is in a video game. Virtual Reality formats are even more performance-intensive because of the increased resolution and higher frame-rate; 60FPS as a working minimum.”
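The frame rate Lek mentions translates directly into a time budget: everything needed to draw one image has to finish before the next frame is due. A minimal sketch of that arithmetic, with a function name invented for this illustration:

```python
def frame_budget_ms(frames_per_second: float) -> float:
    """Time available to compute a single frame, in milliseconds."""
    if frames_per_second <= 0:
        raise ValueError("frame rate must be positive")
    return 1000.0 / frames_per_second


# At the 60FPS working minimum quoted above, each frame gets under 17 ms.
print(f"{frame_budget_ms(60):.2f} ms per frame")
```

A pre-processed film frame faces no such deadline, which is why it can afford far more computation per image.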

Lek uses a variety of tools, including Unreal Engine, which allows him to create realistic VR environments where players can reach out, touch and interact with objects in the simulated world. Lek also employs the Unity engine, which is increasingly popular among game developers in particular.

The functions of engines such as Unreal are broken down into a rendering engine and an audio engine, which generate the 3D graphics and sound, and a physics engine, which emulates the laws of physics (gravity, buoyancy and so on) within the virtual world. These functions coordinate with an AI, which is responsible for controlling entities within the simulated world.

Lek explains: “Of course all this is also dependent on the complexity of the scene itself; a busy urban scene, with realistic lighting, explosions and animated characters is far more difficult to calculate than a simple stylised scene of a cube in a carpeted room.”


For a physics engine to realistically model the movement of, say, a person’s hair as it flows, flexes, and is acted upon by gravity and by the air around it is a complex task. Modelling every hair on a head would alone require huge processing power. Designers must find ways of working around such complexity, for example by approximating the movement of whole collections of hair rather than individual strands. In the case of virtual worlds created for leisure - such as video games - the simulation only has to be good enough to allow suspension of disbelief. All the VR tech in the world is useless without a well-structured story, told with skill and verve. On the other hand, virtual worlds built as practical models will have different priorities. A virtual model of an aeroplane cabin, constructed to test how the structure’s occupants will fare in the case of a hard landing, will need to simulate certain properties of the human body in order to find out what happens to it on impact. However, including the aforementioned realistic modelling of each occupant’s hair would be a waste of computing resources.
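One common way to approximate whole collections of hair is to run the expensive physics on a single ‘guide’ strand per clump and let the remaining strands simply follow it. The sketch below is a toy version of that idea, not the method of any particular engine. The class, the function names and the stand-in physics (a strand tip falling from rest under gravity) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
GRAVITY_M_PER_S2 = -9.81  # acts along the y axis in this toy model


@dataclass
class HairClump:
    guide_tip: Vec3             # simulated position of the guide strand's tip
    strand_offsets: List[Vec3]  # fixed offset of each follower strand from the guide


def simulate_guide(tip: Vec3, dt: float) -> Vec3:
    """Stand-in for the costly physics step: the tip falls from rest for dt seconds."""
    x, y, z = tip
    return (x, y + 0.5 * GRAVITY_M_PER_S2 * dt * dt, z)


def update_clump(clump: HairClump, dt: float) -> List[Vec3]:
    """Run the physics once for the guide, then place every follower relative to it."""
    gx, gy, gz = clump.guide_tip = simulate_guide(clump.guide_tip, dt)
    return [(gx + ox, gy + oy, gz + oz) for ox, oy, oz in clump.strand_offsets]


clump = HairClump(guide_tip=(0.0, 1.0, 0.0),
                  strand_offsets=[(0.1, 0.0, 0.0), (-0.1, 0.0, 0.0)])
print(update_clump(clump, dt=0.1))
```

The saving comes from the ratio: a clump of a thousand strands costs one physics call plus a thousand cheap additions, rather than a thousand physics calls.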

Future virtual worlds

For an indication of what the future may hold for designers of virtual worlds, it is worth looking at SpatialOS. This system, developed by the firm Improbable, lets users make more intricate virtual environments than ever before by combining the abilities of different engines and breaking the simulated world into manageable chunks that can be spread across multiple servers. This, in turn, solves a longstanding problem: object permanence. In the past, designers of virtual worlds have saved on computing power by having objects within the simulated environment disappear whenever the user is not looking at them. While this helps simulations run faster, it also means that changes made to the environment by the user are sometimes lost - radically limiting the usefulness of the simulation. Seamlessly spreading a virtual world over a number of servers allows for ‘persistence’: the ability of a simulation to continue running while the user is away.
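The idea of breaking a world into chunks and spreading them across servers can be sketched in a few lines. This is a generic illustration of spatial partitioning and not Improbable’s API. The chunk size, the function names and the rule for mapping chunks to servers are all assumptions.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

CHUNK_SIZE = 100.0  # assumed side length of one square chunk, in world units


def chunk_of(x: float, y: float, chunk_size: float = CHUNK_SIZE) -> Tuple[int, int]:
    """Grid coordinates of the chunk that contains the point (x, y)."""
    return (math.floor(x / chunk_size), math.floor(y / chunk_size))


def server_for(chunk: Tuple[int, int], server_count: int) -> int:
    """Deterministically map a chunk to one of server_count servers."""
    cx, cy = chunk
    return (cx * 31 + cy) % server_count


def assign_entities(positions: Dict[str, Tuple[float, float]],
                    server_count: int) -> Dict[int, List[str]]:
    """Group entity names by the server responsible for their chunk."""
    assignment: Dict[int, List[str]] = defaultdict(list)
    for name, (x, y) in positions.items():
        assignment[server_for(chunk_of(x, y), server_count)].append(name)
    return dict(assignment)


world = {"car": (250.0, -10.0), "tree": (260.0, -20.0), "lamp": (10.0, 10.0)}
print(assign_entities(world, server_count=4))
```

In a scheme like this, each server keeps simulating its own chunks whether or not any user is looking at them, so a change made in one corner of the world is still there when the user returns.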

There are other potential benefits to this type of arrangement. In future, anyone wishing to simulate anything in a virtual world will be able to tap into the work of other designers, saved on servers anywhere in the world, in order to add new elements to their own simulation. For example, when building a virtual world to test a self-driving AI, a designer will be able to call upon pre-existing simulations of road systems, complete with simulated pedestrians and traffic lights, already waiting for them on a server. They could even borrow realistic weather conditions from another simulation running elsewhere and add them to their own virtual world. Through this collaborative, ad-hoc approach, ever more complex virtual worlds become possible.