Requiem for a Stream: How to write Streams Good

Posted on May 01, 2020 · 15 mins read

You are eight years old. You are visiting Aunt Beatrice in Tunbridge Wells, where your older cousin is showing you how to solve a Rubik’s cube. He is talking at a million miles per hour, hoping to lose you, trying to show off how clever he is. “Rotate. Invert. Oscillate. See? Reverse. Rotate again.” Aunt Beatrice chimes in: “There’s no point, Augustus, he’s too young, he won’t get it.” He hands you the scrambled cube, smug grin on his face. “Did you get that?” You desperately try to follow – you are NOT a baby – you hate visiting your cousins.

You are sixteen years old. You have just got a D in your Additional Maths GCSE. The words of your Maths teacher Mrs Valchek ring in your ears: “this module has some pretty complicated concepts and is not to be taken lightly.” You took the module anyway: you are NOT an idiot and you thought you could prove it by wandering into an exam armed with the knowledge of half-remembered Year 11 Maths lessons and a half-dozen advanced Churchill papers.

You are twenty-four years old. You are sitting in a programming interview. The question is simple enough – too simple to really blow someone away with a measly solution. You are NOT your average programmer and you know just how to show it: your good old friend streams. That pillar of proving that you are clever in interviews. Functional. Readable. Java 8. Modern. Very sexy. Yeah. It’s time.

Your hands shake as you reach for the keyboard. The power of a half-remembered second-year Computer Science module and a half-dozen cans of Monster Pipeline Punch flows through your veins. In a blaze of streams you cut through the question in seconds. Take that, Augustus! But something is wrong: there is a part two to this question, and you also need to report frequencies for monitoring. The interviewer shoots you an imperceptible smirk. “Do you understand the requirements of part two?”

You relax, you breathe, you hate interviews. There must be a way out of this and streams is definitely the answer. Filter. Count. Map. Reduce. Collect. Dangit, now you have to write a custom collector function. What the function is a BiConsumer? How do you return a String and a Map from a lambda function in Java? Your house of streams comes crashing down. The interviewer leans over. “Why don’t you explain to me what you’re trying to do?”

You are sixteen, the big red D quivers in your hands. Valchek’s words ring in your ears as she outlines what RTFQ stands for. You are eight, grappling with the Gordian knot of coloured squares in your tiny, jam-coated hands. Darn thing won’t turn. You are NOT an idiot. Your cousin leans over you, “let me show you again.” You fly into a blind rage, your heart going a mile a minute from the anger, or maybe it’s the Pipeline Punch? “It’s not my fault, the damn thing won’t work!”

Your interviewer leans back, their eyes glazed over, their mind made up. “Why don’t you start with a for loop?” You curse under your breath, sweat covering your face as the jam from your hands drips loudly onto the keyboard they provided. You’ve just soiled yourself in this interview, figuratively and literally.

Streams are good though?

Streams are a good thing, right? They allow for an incredible level of readability, making for code that is close to plain English, all in a perfect first-order functional paradigm. No side effects, singular return, no mutable state. Writing your code in this style requires some work but the process of doing so feels like you are cleaning up messy code. Removing unnecessary variables, extracting required constants to top-level immutable variables, validating inputs first to remove the possibility of exceptions during processing. Not to mention that it gets rid of all those ugly nested loops and curly brackets. But then why does working with them sometimes feel so awkward and confusing?

If our streams aren’t being written in a consistently functional environment, the beautiful, concise interface they provide rapidly transforms into a snarling hellhound of unreadable excrement. What was five commands in your head becomes a sprawling mess of indecipherable lambda functions, riddled with disgusting try-catch blocks and spattered with orElses and getOrDefaults. Often what was one stream may be split into two or three streams to facilitate multiple outputs or flow control. The precious code has been befouled, figuratively and literally.

Sometimes writing streams feels satisfying, straightforward and natural. The way all coding should be. Other times it’s a frustrating mess that wastes your time and ends up thrown out the window and replaced with loops in a flurry of frustration. It can be very hard to tell beforehand which experience you will get, but after a few goes on the merry-go-round of writing code you start to pick up a few tricks that let you see more of the good side of streams and less of the bad.

The Rules

1 – Do not try to use streams for everything.

Programmers are notorious for falling into the trope of the hammer-wielding moron who sees naught but nails. (See: Kafka, Streams, Blockchain, Kubernetes, Blockchain, Machine-Learning and Blockchain). We all know how tempting it can be to ram that square peg of yours into whatever circular hole your ticket directed you towards, but it is not the way forward. The wise Streams user first assesses the problem space and knows when to use them: what is a nail and what is a screw?

Avoid using them in low-level code – pesky corporeal things such as networks and disks are unreliable and must be handled with care. This leads to exception handling and edge-case management that will complicate your streams. If you really insist on using streams, encapsulate that exception handling: extract the data and verify its integrity long before you start processing it.
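A minimal sketch of that separation, assuming a made-up input of one integer reading per line (ReadingStats, loadReadings and sumOfPositiveReadings are names invented for illustration, not anything from a real codebase). The ugly try-catch work happens in a plain loop up front, and the stream only ever sees clean data.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example: the class, methods and input format are assumptions.
class ReadingStats {

    // Step 1: deal with the unreliable world up front, in boring imperative code.
    static List<Integer> loadReadings(BufferedReader source) {
        List<Integer> readings = new ArrayList<>();
        try {
            String line;
            while ((line = source.readLine()) != null) {
                String trimmed = line.trim();
                if (trimmed.isEmpty()) {
                    continue; // blank lines are skipped
                }
                try {
                    readings.add(Integer.parseInt(trimmed));
                } catch (NumberFormatException e) {
                    // Malformed input is rejected here, not halfway through a stream.
                    throw new IllegalArgumentException("Not a reading: " + trimmed, e);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return readings;
    }

    // Step 2: the stream works on validated data, so it needs no try-catch.
    static int sumOfPositiveReadings(List<Integer> readings) {
        return readings.stream()
            .filter(reading -> reading > 0)
            .mapToInt(Integer::intValue)
            .sum();
    }

    public static void main(String[] args) {
        BufferedReader source = new BufferedReader(new StringReader("3\n-1\n\n4\n"));
        System.out.println(sumOfPositiveReadings(loadReadings(source)));
    }
}
```

The loop is not pretty, but it is the only place that has to know the input can be broken.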

Generally you should look for anything that pushes you away from a first order functional paradigm. One stream returns one object: them’s the rules. This makes them the perfect tool for abstract tasks such as data processing or toy problems but causes problems in production code when secondary outputs like reporting and monitoring are also required. Not to say that these vital tasks cannot be performed in a functional manner, but they will require your codebase to be structured functionally from the get-go.
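As a rough illustration of what “one stream returns one object” means once monitoring turns up, here is a sketch with an invented list of drink orders (DrinkOrders and its methods are hypothetical). The primary result and the frequency report each get their own stream, and the frequencies come from the standard Collectors.groupingBy and Collectors.counting rather than a hand-rolled collector.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical example: the orders and method names are invented for illustration.
class DrinkOrders {

    // Primary output: one stream, one object (the distinct drinks, sorted).
    static List<String> distinctDrinks(List<String> orders) {
        return orders.stream()
            .distinct()
            .sorted()
            .collect(Collectors.toList());
    }

    // Secondary output for monitoring: a second stream with its own single result.
    static Map<String, Long> frequencies(List<String> orders) {
        return orders.stream()
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> orders = Arrays.asList("tea", "coffee", "tea", "squash");
        System.out.println(distinctDrinks(orders));
        System.out.println(frequencies(orders));
    }
}
```

Two passes over the data is the price of keeping each stream honest. Whether that price is acceptable depends on how big the data is and how much you hate custom collectors.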

2 – Write your own methods

A lot of the ugliness that arises from streams comes from trying to do something that isn’t quite supported by the API you are using. The default solution, the one normally suggested by IDEs (morons), is an in-line lambda function. Whilst tempting, this is often where readability goes out the window. I don’t care if there is only one parameter; if you call it “x” I will call you names I am not allowed to type on a work laptop. Furthermore, unless your anonymous lambda is so painfully obvious that your cousin Augustus could understand it, even after he was kicked in the head by a horse on that school trip back in ‘09 (that’s 2-1 to me, Augustus you loser), extract it to its own method. Method names are powerful, useful tools that your colleagues will thank you for using properly. You should find the resulting method reference satisfyingly readable.

When writing streams complexity can compound quickly. A rule of thumb for coding that gets around is that 10 lines is about as long as a method should be. When writing streams you may want to be even more trigger happy with breaking down code into logically intelligible chunks, even if those methods aren’t reused. It is prudent to break out a new method whenever you are doing something specific to the domain you are working in.

public LocationPair findClosestLocationsWithFeature(List<Location> points, String feature){
  return points.stream()
    .filter((factory -> factory.stock.contains(feature)))
    .map((Location factory) ->  (points.stream().map(point -> new LocationPair(factory, point))))
    .reduce(Stream::concat)
    .orElse(empty())
    // min() takes a Comparator, so rank each pair by its own computed distance
    .min(Comparator.comparingDouble((LocationPair locationPair) -> Math.sqrt(
        Math.pow((locationPair.firstLocation.latitude - locationPair.secondLocation.latitude), 2) +
            Math.pow((locationPair.firstLocation.longitude - locationPair.secondLocation.longitude), 2))
    ))
    .orElse(null);
}

This is ugly as hell

public LocationPair findClosestLocationsWithFeature(List<Location> points, String feature){
  return points.stream()
    .filter((factory -> factory.stock.contains(feature)))
    .map((Location factory) ->  (points.stream().map(point -> new LocationPair(factory, point))))
    .reduce(Stream::concat)
    .orElse(empty())
    .min(Comparator.comparingDouble(LocationPair::getDistanceBetweenLocations))
    .orElse(null);
}

This is not

This may seem like the basics of writing good code, because it is. However obvious these basics may sound, they are often forgotten when writing streams, thanks to the flexibility of anonymous lambda functions and the desire for concision. Really it is more important than ever to sprinkle in well-named methods when streaming, in order to keep up that flow of understandable commands. Where you see ugliness, stick it in a method and slap a name on it, and now you have an argument for not calling it ugliness.
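For anyone who wants to run the idea, here is a self-contained sketch of the same method with every awkward step given a name. The Location and LocationPair classes are my guesses, since only their field names appear in the snippets above. Swapping map and reduce for flatMap, returning an Optional instead of null, and refusing to pair a location with itself are also my assumptions, not part of the original example.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Stream;

// Assumed shape: only the field names come from the snippets in this post.
class Location {
    final String name;
    final List<String> stock;
    final double latitude;
    final double longitude;

    Location(String name, List<String> stock, double latitude, double longitude) {
        this.name = name;
        this.stock = stock;
        this.latitude = latitude;
        this.longitude = longitude;
    }
}

class LocationPair {
    final Location firstLocation;
    final Location secondLocation;

    LocationPair(Location firstLocation, Location secondLocation) {
        this.firstLocation = firstLocation;
        this.secondLocation = secondLocation;
    }

    // Straight-line distance in coordinate space; fine for a toy, wrong for a globe.
    double getDistanceBetweenLocations() {
        return Math.hypot(
            firstLocation.latitude - secondLocation.latitude,
            firstLocation.longitude - secondLocation.longitude);
    }
}

class ClosestLocations {

    static Optional<LocationPair> findClosestLocationsWithFeature(List<Location> points, String feature) {
        return points.stream()
            .filter(factory -> factory.stock.contains(feature))
            .flatMap(factory -> pairWithEveryOtherLocation(factory, points))
            .min(Comparator.comparingDouble(LocationPair::getDistanceBetweenLocations));
    }

    // Assumption: a location paired with itself (distance zero) is not a useful answer.
    private static Stream<LocationPair> pairWithEveryOtherLocation(Location factory, List<Location> points) {
        return points.stream()
            .filter(point -> point != factory)
            .map(point -> new LocationPair(factory, point));
    }

    public static void main(String[] args) {
        List<Location> points = Arrays.asList(
            new Location("A", Arrays.asList("widgets"), 0, 0),
            new Location("B", Arrays.asList("jam"), 3, 4),
            new Location("C", Arrays.asList("jam"), 1, 0));
        findClosestLocationsWithFeature(points, "widgets")
            .ifPresent(pair -> System.out.println(pair.firstLocation.name + " -> " + pair.secondLocation.name));
    }
}
```

Every line of the pipeline now reads as a step in the problem rather than a step in the arithmetic.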

3 – Blame Java

Okay, let’s be real here. Java is crap – I know it, you know it, we all know it. Back in the day it was revolutionary for the guarantees provided by the JVM. Now it’s a dying blob of a language, trying to suck in features from other languages to be crudely assimilated into its sprawling ecosystem. Kept alive by the momentum of days long gone and the thousands of schools heartlessly pumping out wave after wave of wide-eyed Java programmers like lambs to the SlaughterFactoryMangerProviderHelperImpl.

static final List<String> hotDrinks = Arrays.asList("tea", "coffee");

public List<String> hotCustomers() {
    return customers.stream()
        .filter(
            cust -> hotDrinks.stream().anyMatch(
                drink -> cust.drink.contains(drink)
            )
        )
        .map(Customer::getName)
        .collect(Collectors.toList());
}

This is an example of a “straightforward” stream in Java

val hotDrinks = listOf("tea", "coffee")

fun hotCustomers(): List<String> =
    customers
        .filter {
            cust -> hotDrinks.any {
                cust.drink.contains(it)
            }
        }
        .map {it.name}

Here, through Kotlin’s implicit it parameter and the removal of the painful collector pattern, a lot of the boilerplate has been removed, leaving behind code that actually describes what you want.

It is really impressive what streams are capable of when they are used in a language that didn’t hastily graft them onto the side as a rushed afterthought. Just like the way your cousin Cleatus duct-taped that “jet pack” made of twelve deodorant cans and a blowtorch to his back before shattering his pelvis off the cliffs of Dover. On a superficial level a lot of the eyesores that litter Java streams vanish in a puff of well-thought-out code, and if used properly they can do wonderful things like flow control in a way that is legible to actual humans.

authorization::read_set_of_lists(&path.username, &logged_in_user, &state)
	.and_then(|_| state.database.get_lists(&path.username))
	.map(|lists| HttpResponse::Ok().json(lists))
	.map_err(response::from_error)

Here any errors that arise from read_set_of_lists OR database.get_lists are handled by Rust’s map_err method which interprets the error.

The above code, written in Rust, demonstrates how stream-style chaining can manage exceptional cases without tacking orElse(null) onto every other line. By handling problems at the end of the chain, the syntactic flow of the “happy case” is uninterrupted by what may well be theoretical problems in processing the data. If you are of the opinion that streams are a fancy way of complicating loops and shouldn’t be used, I advise you to seek them out in the sort of functional environment where they belong. I would also advise you that “map” and “reduce” are hardly new concepts in the realm of data processing.
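For the Java-bound, the nearest thing in the standard library is Optional, which can imitate the shape of that chain but throws away the error detail that map_err gets to inspect. The sketch below is a loose imitation under that limitation, not a port of the Rust: ListLookup, authorise, getLists and the fake responses are all invented for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical example: every name and the in-memory "database" are assumptions.
class ListLookup {

    private static final Map<String, List<String>> LISTS_BY_USER = new HashMap<>();

    static {
        LISTS_BY_USER.put("augustus", Arrays.asList("cubes", "horses"));
    }

    // Empty means "not allowed"; unlike a Rust Result, the reason is lost.
    static Optional<String> authorise(String requestedUser, String loggedInUser) {
        return requestedUser.equals(loggedInUser)
            ? Optional.of(requestedUser)
            : Optional.empty();
    }

    // Empty means "no lists stored for this user".
    static Optional<List<String>> getLists(String username) {
        return Optional.ofNullable(LISTS_BY_USER.get(username));
    }

    // The happy path reads top to bottom; the unhappy path is handled once, at the end.
    static String respond(String requestedUser, String loggedInUser) {
        return authorise(requestedUser, loggedInUser)
            .flatMap(ListLookup::getLists)
            .map(lists -> "200 " + lists)
            .orElse("404 not found");
    }

    public static void main(String[] args) {
        System.out.println(respond("augustus", "augustus"));
        System.out.println(respond("augustus", "beatrice"));
    }
}
```

It works, but every failure collapses into the same 404, which is rather the point of the complaint.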

Conclusion

You are thirty-two years old. You are in another interview, this time for a senior position. You stare at the toy problem that is something to do with searching a set, you think. You have long forgotten how to write the CompSci-101-esque code required to solve this problem. You are on the verge of having a panic attack but you do not show it. You are used to imposter syndrome now. You hide your quivering hands by stroking your chin. You lean back and say, with a well-honed faux-sageness, “Streams would be good here, but I think I will keep it simple and write a for loop.” In reality you don’t trust yourself to not screw it up under pressure. Over the years you’ve learnt that all you really need to do is make the unit tests pass and not try to steal any office equipment during the interview and you will probably get the job. In response to your sound imitation of wisdom, the interviewer’s eyes light up and they nod slightly. You’re in.