Introduction #
Computer security is very closely related to non-computer security. That’s why we will start with an analogy in a non-computer world and try to transfer the concepts to the computer world.
Building a house (as if we were building an app) #
Imagine that you want to build a house for you and your family. What properties should this house have? Well, it should be of a certain size, painted a certain colour, and have a certain number of rooms designed in a certain way. It should be built in a suitable spot: somewhere with a road leading to it, and with access to electricity and water.
But, more importantly, it should be safe, it should be built in a way that the walls, ceilings, and the roof are unlikely to collapse on you while you sleep, even when there is a thunderstorm outside.
It should also be private. A fully transparent floor-to-ceiling window in the bathroom is not desirable for most people. Same goes for the bedroom and to some extent other rooms too.
It should be secure. You want to store some valuable things, like family pictures, jewellery, money, a lawn mower, and so on, in your house. When you go on a holiday, you want to relax knowing these things in your house are safe from thieves.
Can you build a house that is 100% safe? In the real world, you can’t. If a meteorite comes, even the strongest standard house roof will collapse. Is there anything you can do if you want to secure your house from meteorites? Well, you could build a bunker buried a few meters underground, but that probably isn’t a house you’d want to live in for very long. And in any case, once the sun exhausts its hydrogen fuel in a couple of billion years and dies, swallowing the planets of our solar system, including Earth, your house will most likely not be safe from that.
Can you build a house that is 100% private? I mean, if you remove all the windows or build the aforementioned underground bunker, you can make it very private. But some information about what’s happening inside the house would still leak. The electricity company could see from the electricity consumption whether anyone is present inside the house. The water company could see when you are taking showers. The garbage company could see what you are throwing out (and there is a whole lot of information one can learn from that). You can always get solar panels, recycle all water, and grow your own food, but you’ll still have to go out once in a while to buy more advanced supplies, and besides, going off-grid to this extent takes an amount of work few people are willing to put in.
Can you build a house that is 100% secure? Well, you can have multiple fences, set up retina-scanning systems, hire guards, train dogs, install cameras. But very likely, with enough effort and budget, someone will still eventually be able to get in. Whether discreetly, think heist-movie-style breach, or by pure force, think SWAT-style raid. Humans make mistakes, and technology can fail too. A highly dedicated thief who wants to get inside your house has time on their side. They can wait years for the one occasion when your guard is sick and your cameras are out of service, and use that opportunity to their advantage.
“Well, if I can never achieve perfect safety, privacy or security, why even try then?”
- you, right now, probably
Saying there is always a way to violate these properties doesn’t mean violating them is always easy.1 The spectrum from a completely unprotected house to a hyperparanoid underground bunker is very wide. There are a lot of measures you can take to make your house reasonably safe, private and secure.
What does reasonably mean? Well, it all comes down to the threats you want to eliminate when designing your house. Do you want to consider the threat of a meteorite hitting it? The chances of that are really small, so most people would just ignore it. What about burglars? An alarm system or CCTV are pretty cost-effective solutions, and you can put more valuable items into a special safe, or even a bank. Strong locks also delay burglars long enough that they are likely to get caught. That won’t protect you from Tom Cruise, but at that point most burglars would rather break into your neighbour’s house, since he has no alarm and regularly forgets to lock his door.
There is also legislation that puts requirements (norms) on a house for it to be legal and approved for occupancy (kolaudace in Czech). The goal of these norms is to make all houses at least somewhat safe using standardized approaches, so they often go hand in hand with reasonable protection, but not necessarily. Either way, to build a house legally, you need to follow them whether you like them or not.
As you might also notice by now, in general, measures to counter threats go against other features you might want the house to have, they affect its usability. Big windows are beautiful and you get a lot of sunlight, but they are not compatible with comprehensive meteorite or theft protection. Balancing protection and usability, along with resources, like your time, energy and money, is what protecting your house comes down to in the end.
Building an app (as if we were building a house) #
Creating a computer program, a web or mobile application, an embedded system – in general an “app”2 – is quite similar to building a house in many aspects, but in some aspects it is even worse.
When building an app, you also have many requirements for it, whether from you personally, your company, your client, or other stakeholders: how it will look, what features it will offer. You want it to be “safe”, “private”, and “secure” too.
Safety of apps can be understood in terms of dependability and reliability: we don’t want our app to crash all the time, our users to get frustrated, or results to be incorrect or unexpected. Although closely related, this is not the topic of this course; it is covered in many courses at our faculty taught by the Department of Distributed and Dependable Systems. Safety can also be understood in terms of the safety of the user. For example, if we are building a social network, we want to make sure that harmful content does not spread between our users. We won’t spend much time covering that here either.
Privacy and security are almost directly transferable. When you are building a document cloud, you probably want the documents to stay private, unless the owner explicitly shares them. Building a bank app? It would be pretty bad if someone could drain an account that is not their own. Those are very obvious examples of security and privacy violations, but the border is often more blurry.
Is it a privacy issue if anyone can learn whether a given email address belongs to a user of the document cloud using the password reset feature? Is it a security issue if the server returns its specific version in an HTTP response header? In cases like these, answering a seemingly simple question like “Is this behaviour a security issue?” requires some thinking, typically in a process called threat modeling, which we will get into later.
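To make the first question concrete, here is a minimal sketch in Java of two hypothetical password-reset handlers. The class, method names, and response messages are made up for illustration; the point is only the difference in observable behaviour:

```java
import java.util.Set;

// Sketch of how a password-reset endpoint can leak which emails are registered.
class PasswordReset {
    // Stand-in for the user database.
    private static final Set<String> USERS = Set.of("alice@example.com");

    // Leaky variant: the response differs depending on whether the account
    // exists, so anyone can test arbitrary addresses against the database.
    static String resetLeaky(String email) {
        return USERS.contains(email) ? "Reset link sent." : "No such account.";
    }

    // Uniform variant: the response is identical either way; the reset email
    // is only actually sent when the account exists (hypothetical helper
    // omitted here), so an outsider learns nothing.
    static String resetUniform(String email) {
        // if (USERS.contains(email)) sendResetEmail(email);
        return "If that address is registered, a reset link has been sent.";
    }

    public static void main(String[] args) {
        System.out.println(resetLeaky("bob@example.com"));
        System.out.println(resetUniform("bob@example.com"));
    }
}
```

Whether the leaky variant is an actual problem depends on the app: for a document cloud it may be acceptable, for a dating site it very likely is not.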
Even in the computer world, there is no such thing as 100% security or privacy, or at least no practical and provable perfect security and privacy.3 The halting problem (and its extension, Rice’s theorem) tells us that there can be no general procedure that decides, for every program, whether it is “secure”, no matter how we define “secure” (as long as it’s a non-trivial property of the program’s behaviour).
Legislation and norms (like PCI DSS, Zákon o kybernetické bezpečnosti) also exist, to set the bar for what is reasonable security and privacy. As with any law, when it first gets written, many edge-cases are not handled by it and only by gradual development does the letter of the law start approaching the original intent. This is especially true for laws regarding cybersecurity – an incredibly new and complex field, which changes drastically almost every day.
Almost every application is going to be used by people (at least for now), and that means it has to be usable. If an application required a 100-character password, email and phone confirmation, and five different hardware tokens to log in, not many people would use it. There is often a trade-off between security and usability, and finding the right balance is hard. When a security measure decreases usability too much, it can also drive users to insecure behaviour, which can actually make the situation worse rather than better. Users forced to change their password every month will often resort to small modifications (like appending a sequential number) and might forget their password more often, overloading tech support with reset requests and making themselves more vulnerable to phishing.
The extra problems with security of computers, especially on the internet #
We will now argue why computer security is hard, going over some of the problematic aspects.
Complexity #
Computers are very complex nowadays. It’s also very cheap to make them, so they are fitted everywhere, even in cases where something much simpler would have been sufficient.
Consider this example state diagram for a dishwasher (like the ones from the 1970s) with a simple controller:
```mermaid
stateDiagram-v2
    [*] --> Idle
    Idle --> Filling : start && door_closed
    Filling --> Prewash : water_level_raised
    Prewash --> DispenseDetergent : prewash_timer_done
    DispenseDetergent --> Soak : detergent_tank_empty
    Soak --> MainWash : soak_timer_done
    MainWash --> Rinse : mainwash_timer_done
    Rinse --> Drain : rinse_timer_complete
    Drain --> Dry : no_water_out
    Dry --> Done : drying_timer_complete
    Done --> Idle : door_solenoid_open
    Idle --> Error : sensor_fail
    Filling --> Error : no_water
    any --> Error : catastrophic_fail
    Error --> Idle : reset
```
Even though this is a bit simplified, it is not oversimplified: back in the day, the logic in the dishwasher really did switch between these states at the hardware level. Other than through a sensor failure, it was impossible for the dishwasher to be in any other state. That makes it easy to audit from both the safety and security points of view. Those who attended NTIN071 Automata and Grammars can see why: the dishwasher can be represented as a DFA, and we have plenty of theory for analysing those.
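To illustrate, here is a minimal sketch of such a controller as a deterministic finite automaton in Java. The state and event names follow the diagram above; the transition table itself is illustrative, not taken from any real appliance:

```java
import java.util.Map;

// The 1970s dishwasher as a DFA: the entire behaviour is one finite lookup
// table, so auditing the controller means reading eleven states and a handful
// of transitions, nothing more.
class DishwasherDfa {
    enum State { IDLE, FILLING, PREWASH, DISPENSE_DETERGENT, SOAK,
                 MAIN_WASH, RINSE, DRAIN, DRY, DONE, ERROR }

    private static final Map<State, Map<String, State>> TRANSITIONS = Map.of(
        State.IDLE, Map.of("start_and_door_closed", State.FILLING,
                           "sensor_fail", State.ERROR),
        State.FILLING, Map.of("water_level_raised", State.PREWASH,
                              "no_water", State.ERROR),
        State.PREWASH, Map.of("prewash_timer_done", State.DISPENSE_DETERGENT),
        State.DISPENSE_DETERGENT, Map.of("detergent_tank_empty", State.SOAK),
        State.SOAK, Map.of("soak_timer_done", State.MAIN_WASH),
        State.MAIN_WASH, Map.of("mainwash_timer_done", State.RINSE),
        State.RINSE, Map.of("rinse_timer_complete", State.DRAIN),
        State.DRAIN, Map.of("no_water_out", State.DRY),
        State.DRY, Map.of("drying_timer_complete", State.DONE),
        State.DONE, Map.of("door_solenoid_open", State.IDLE)
    );

    private State state = State.IDLE;

    // Events not listed for the current state are ignored; "catastrophic_fail"
    // can fire from anywhere, and "reset" recovers from ERROR.
    State fire(String event) {
        if (event.equals("catastrophic_fail")) {
            state = State.ERROR;
        } else if (state == State.ERROR && event.equals("reset")) {
            state = State.IDLE;
        } else {
            state = TRANSITIONS.getOrDefault(state, Map.of())
                               .getOrDefault(event, state);
        }
        return state;
    }

    public static void main(String[] args) {
        DishwasherDfa dw = new DishwasherDfa();
        dw.fire("start_and_door_closed");
        System.out.println(dw.fire("water_level_raised"));
    }
}
```

Because every reachable state and transition is enumerable, questions like “can the heater run while the door is open?” can be answered by exhaustively checking the table.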
Now, have a look at this simplified, far from complete, diagram representing a smart WiFi-enabled Linux-based dishwasher:
```mermaid
stateDiagram-v2
    [*] --> PowerOff
    PowerOff --> Booting : power_on
    Booting --> Operational : kernel_and_services_up
    Operational --> PowerOff : power_cut
    state Operational {
        [*] --> Orchestrator
        %% Orchestrator coordinates user actions and scheduled cycles
        state Orchestrator {
            [*] --> SystemInit
            SystemInit --> ServiceManager : init_systemd_like
            ServiceManager --> AppAgent : start_user_agent
            AppAgent --> CycleController : user_start / schedule_event
            AppAgent --> Maintenance : enter_maintenance
        }
        %% CycleController manages the washing sequence (explicit wash states)
        state CycleController {
            [*] --> IdleCycle
            IdleCycle --> Filling : start_cycle / door_closed
            Filling --> Prewash : water_level_ok
            Prewash --> DispenseDetergent : prewash_done
            DispenseDetergent --> MainWash : detergent_released
            MainWash --> Rinse : mainwash_timer_done / sensor_confirm
            Rinse --> Drain : rinse_complete
            Drain --> Dry : drain_complete
            Dry --> Completed : drying_complete
            Completed --> IdleCycle : user_remove_dishes
            %% asynchronous interrupts and sensor feedback
            any_state --> Error : catastrophic_sensor_fail
            Error --> IdleCycle : reset
        }
        %% Hardware control plane
        state HW_Controller {
            [*] --> WaterSubsystem
            WaterSubsystem --> Pump : run_pump / stop_pump
            WaterSubsystem --> Valves : open_valve / close_valve
            [*] --> MotorSubsystem
            MotorSubsystem --> SprayArms : spin / stop
            [*] --> HeatingSubsystem
            HeatingSubsystem --> Heater : heat_on / heat_off
            Dispenser --> ReleaseMechanism : open / close
            Sensors --> HW_Controller : interrupts / readings
        }
        %% Sensors & telemetry (many inputs influencing the wash)
        state Sensors {
            [*] --> DoorSwitch
            [*] --> WaterLevel
            [*] --> TempSensor
            [*] --> TurbiditySensor
            [*] --> VibrationSensor
            TurbiditySensor --> CycleController : soil_level_hint
            TempSensor --> HeatingSubsystem : temp_feedback
            WaterLevel --> WaterSubsystem : level_feedback
        }
        %% Connectivity, cloud and updates
        state Network {
            [*] --> WiFi
            WiFi --> CloudConnector : mqtt/https
            CloudConnector --> MobileApp : push/telemetry
            CloudConnector --> OTA : check_for_updates
            OTA --> AppAgent : deliver_update
        }
        %% Local interfaces and indicators (no WebUI or SSH)
        state Interfaces {
            [*] --> LocalUI
            LocalUI --> SoundLighting : beep/leds
            LocalUI --> MobileApp : pairing / bluetooth_handshake
            USB --> Logs : log_dump / usb_storage (diagnostics)
        }
        %% Userland where arbitrary code and interpreters may run
        state Userland {
            [*] --> ShellLikeEnv
            ShellLikeEnv --> Interpreters : python / lua / node (embedded)
            Interpreters --> AppAgent : run_script
            PackageManager --> AppAgent : install/package_update
            Cron --> Scheduler : scheduled_tasks
        }
        %% Cross-links (asynchronous, non-deterministic)
        AppAgent --> Network : telemetry/upload
        AppAgent --> CycleController : high_level_commands (change cycle, pause)
        Interpreters --> HW_Controller : script_invoke_actuator
        OTA --> PackageManager : install_package
        USB --> AppAgent : local_update_payload
        Logs --> CloudConnector : upload_logs
    }
    note right of CycleController
        The actual wash states (Filling → Prewash → MainWash → Rinse → Drain → Dry)
        are orchestrated by software agents.
    end note
    note right of Userland
        Interpreters / package manager / shell-like env
    end note
    note left of Network
        Wi-Fi, cloud APIs, OTA, mobile pairing, USB (diagnostics/updates)
    end note
```
This is much more complex than the previous dishwasher. It is very hard to reason about how this dishwasher functions, whether it is safe or secure, and whether it can get into some unexpected weird state that could be used by an attacker to do something nasty. Again, NTIN071 Automata and Grammars enjoyers understand why: this dishwasher contains one, or even several, Turing-complete computers, so it is impossible to faithfully represent it with a diagram like the one above. These Turing machines are then emulating a DFA similar to that of the old dishwasher, with some states added to represent the new “features” of the smart dishwasher.
While the safety and security of a house can feel intuitive to most humans, and the security of a DFA can still be reasoned about, the security of an abstract program running on a Turing machine is counter-intuitive and undecidable in general.4
The complexity and emulation are hidden from the user, who is just clicking buttons to start the cycle, as they are used to, on the dishwasher (or in the mobile app). But they are often also hidden from the programmer, who is programming in a high-level language, unaware how the superscalar multicore CPU running an operating system consisting of millions of lines of code will execute the instructions that make up the program they wrote, using dozens of libraries and a sketchy SDK.
We are living in an age where converter dongles have CPUs with more power than the hardware that was used to land on the Moon.
Scale #
When you want to break into one house, it will take some effort – you have to get to the house first, pick locks or break windows, find all the valuable items, somehow carry them to your hideout and do all that without being spotted. When you want to break into a second house, some tasks might be easier, since you have more experience, but you’ll still have to do most tasks again. We say this attack doesn’t scale well.
On the internet, you have access to anything and everything, all of the time, which is probably the best thing about the internet, but for security it is also the worst thing about the internet (and nothing else, right?). If you can leak private data of one user from a database, you can probably leak the data of all users. If you find a vulnerability that allows you to gain control of one internet-connected light bulb, there are thousands more you can start blinking with. If you want to look at what’s happening on a random parking lot in Russia or in some church in Florida, you don’t have to travel to the other side of the world, thanks to hundreds of unsecured CCTV cameras. Breaking one thing is usually about as easy as breaking thousands of things. As an attacker, you can massively reuse and parallelize to target thousands of victims at once. And as a defender, you must defend your app from all the attackers across the globe at once.
Dependencies and interactions #
What is really good about computer code and programs is that they can be easily, cheaply, and quickly copied and tweaked. People and companies got used to that. If you want to build a web API app, why not use an existing framework to do the things every web app has to do anyway? Want to parse a certain file? There is a library to do that. Want to check if a number is odd? Well, if you are in JavaScript, there is a library for that. And another one for checking if it’s even.
This makes it really fast to write anything, and it is also one of the reasons why computers are everywhere now, even in places where a simpler logic circuit would be sufficient. But it can generate security issues.
It is not necessarily just that the included library can have a security issue itself, like in the case of a DOMPurify bypass, or that its author (or someone else) backdoors it on purpose, like in the case of tinycolor being compromised. It can be the interaction between the surrounding code and the library that makes the specific scenario insecure.
When you use a library, you rely on the assumptions the developer of the library made about how it is going to be used. When these assumptions are not understood by the developer using the library, it’s easy to make a nasty mistake. There are plenty of these assumptions: what is and what is not user-controlled (and so potentially attacker-controlled), but also what is and what is not a filesystem path, a JSON, an IP address, an email address, and so on. Although exact specifications exist for many of these, developers often tend to not fully follow these specifications.
As an example of a misunderstanding of what should be user-controlled, take this snippet of Java code, where we consider the username to be user-controlled:

```java
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

// (...)

Logger logger = LogManager.getLogger();
logger.info("User trying to log in with username: " + username);
```

This might seem innocent at first, but the Log4j logging library has a feature called lookups that allows access to system and environment information just by putting a string like `${...}` in the first argument of the logging functions. So if the user set `${env:SECRET}` as their username, the contents of `SECRET` would get logged.

That doesn’t seem that bad at first. But using the JNDI lookup, which was enabled by default, it was possible to load an arbitrary Java class from the internet and thus execute arbitrary code. This vulnerability was called Log4Shell and was present in many products using Java and the Log4j library, including Minecraft, Cloudflare, and iCloud.
There are many other examples like this.
It is expensive, time-consuming, and doesn’t make money #
Because an average user cannot tell the difference between a hardened and an insecure app, security is very easily overlooked. For managers, making an app secure can seem like a bad investment of resources, especially because, with computers, making something work somehow is much, much cheaper than spending time making sure it works properly™.
Although this attitude is changing in recent years as the stakes and consequences of underestimating and underfunding security become more apparent, situations where it can be encountered are not rare.
Managers also tend to prefer security as a product – antivirus software, enterprise firewalls, and similar, even though what is often truly needed is a massive change in mindset, policy and employee education.
TLDR #
- Safety, security, and privacy are closely related, but independent concepts.
- 100% security is impossible, and security is often a trade-off with usability.
- Securing computer systems is hard:
  - Computers are complex and are everywhere.
  - Everything is connected to the internet, and attacks on computer systems scale extremely well.
  - Projects have dependencies which can be vulnerable or compromised.
  - When developers don’t understand their tools well, innocent features may interact in a way which creates a vulnerability.
Further resources #
1. Just like in math, where knowing that some problem has to have a solution doesn’t mean that actually finding one is easy.
2. As Steve Jobs would have wanted us to call it.
3. Well, there kinda is, actually! In cryptography, there are ciphers that are provably perfectly secret, like the One Time Pad, which give the attacker no information about the plaintext being transferred. For more information, refer to the cryptography lecture of this course, or the Introduction to cryptography course. But when we consider practical systems, like whole programs and computers, a proof of perfect security doesn’t exist.
4. This, unfortunately, also means that when a company is selling computer security solutions, it can quite easily sell snake oil to its clients. The real impact of their solution on security is often only a matter of opinion, and people can have a lot of different opinions.