willvarfar 3 days ago

I've spent a lot of time jumping in to optimise horrendously complicated programs and data pipelines and things. The big wins are from understanding the domain and spotting all the inevitable misunderstandings and unnecessary steps that creep in when systems are so big they get split up and built/maintained by different people at different times.

  • pyfon 2 days ago

    This! Extending the list of 4:

    5. Remove unrequired behaviour.

    6. Negotiate the required behaviour with stakeholders.

    7. UX changes. E.g. make a synchronous flow a background job plus a notification, or return the quick parts of the result sooner (like a progressive JPEG).

    8. Architecture changes. E.g. monolithification, microservification. Lambdas vs. VM vs. Fargate etc.

    And some more technical:

    9. Caches? (see the sketch after this list)

    10. Scalability, more VMs

    11. Move compute local (on client, on edge compute, in a nearby region)

    11a. Same as 11 for data residency.

    12. Data store optimisation e.g. indices, query plans, foreign keys, consistent hashing (arguably a repeat of data structures)

    13. Use a data centre for more bang/buck than cloud.

    14. Compute type: use a GPU instead of a CPU, etc. I'll bundle L1 cache considerations in here too.

    15. Look at sources of latency: proxies, sidecars, network hops (and their location), etc.

    16. GC pauses, event loops, threading, other processes, etc.
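
    For 9, a minimal sketch of what a cache buys you, in C (the fib example, table size and function names are made up for illustration; any pure, repeatedly-called function works the same way):

      #include <stdint.h>
      #include <stdio.h>

      /* Hypothetical expensive pure function: naive Fibonacci. */
      uint64_t fib_slow(unsigned n) {
          return n < 2 ? n : fib_slow(n - 1) + fib_slow(n - 2);
      }

      /* Same result with a memo table: each n is computed once,
         every later call is a single array lookup. */
      uint64_t fib_cached(unsigned n) {
          static uint64_t memo[94];  /* fib(93) is the largest that fits in 64 bits */
          static int      seen[94];
          if (n >= 94) return 0;     /* out of range for this sketch */
          if (!seen[n]) {
              memo[n] = n < 2 ? n : fib_cached(n - 1) + fib_cached(n - 2);
              seen[n] = 1;
          }
          return memo[n];
      }

      int main(void) {
          printf("slow:   %llu\n", (unsigned long long)fib_slow(30));
          printf("cached: %llu\n", (unsigned long long)fib_cached(90));
          return 0;
      }

    The same shape applies at bigger scales too: HTTP caches, query-result caches, memoised computations. The hard part, as usual, is invalidation.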

sirwhinesalot 3 days ago

This is missing the 5th and most important optimization technique: don't pessimize the code in the first place.

The usual culprit is "premature modularization", where code that is used in one place and is never going to be extended is nonetheless full of indirections.
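
A made-up before/after sketch of this kind of indirection (the names Summer, total_indirect and total_direct are invented for illustration):

    /* "Premature modularization": one caller, one behaviour that will
       never be swapped out, yet the call goes through a function
       pointer anyway. */
    struct Summer {
        long (*sum)(const int *xs, int n);
    };

    long total_indirect(const struct Summer *s, const int *xs, int n) {
        return s->sum(xs, n);   /* opaque call: harder to read and to inline */
    }

    /* The non-pessimized version: just do the work where it is needed. */
    long total_direct(const int *xs, int n) {
        long acc = 0;
        for (int i = 0; i < n; i++)
            acc += xs[i];
        return acc;
    }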

  • genman 3 days ago

    In principle it can fit under points 1-3, if you view fewer abstractions as a lower-level system and the code itself as a data structure and algorithm, which can also include different levels of parallelism.

    • Supermancho 2 days ago

      > In principle it can in general fit the points 1-3

      In principle, I don't think people would lump it in.

pluto_modadic 3 days ago

Optimize for maintainable (readable) code with sane, understandable flows. Then optimize for the user experience. Processors complete lifetimes in the time it takes a human to blink. If you're doing /meaningful/ work with those cycles, that's fine. If it's bloat... trim the bloat.

bytepoet 2 days ago

The blog post is really good. I see that there's a follow-up piece 'The Fifth Kind of Optimisation' about parallelism.

Something I'd like to add is that it's helpful to understand the optimization capabilities of your compiler. Ideally, we want to write programs that don't contain what are called 'optimization blockers' - constructs that make it hard for the compiler to generate optimized executables.
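
One classic blocker is potential pointer aliasing (the function names here are made up; the pattern itself is standard):

    /* The compiler has to assume dst and src might overlap, so it must
       either keep the loads and stores in order or emit a runtime
       overlap check before vectorising. */
    void scale_may_alias(float *dst, const float *src, float k, int n) {
        for (int i = 0; i < n; i++)
            dst[i] = src[i] * k;
    }

    /* restrict promises the arrays don't overlap, which removes the
       blocker and lets the compiler vectorise the loop freely. */
    void scale_no_alias(float *restrict dst, const float *restrict src,
                        float k, int n) {
        for (int i = 0; i < n; i++)
            dst[i] = src[i] * k;
    }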

I like the pointer to the blog on accidentally quadratic implementations. I find that the following pattern is often a landmine:

for (int i = 0; i < strlen(s); i++) // code in loop

strlen(s) gets recomputed on every iteration, so each pass costs O(n) and the loop becomes O(n^2) overall.
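
The usual fix is to hoist the length out of the loop condition:

    size_t len = strlen(s);            /* computed once */
    for (size_t i = 0; i < len; i++) {
        // code in loop
    }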

Finally, being aware that I/O latencies are a major source of bottlenecks leads to nice optimizations. One advantage of multiple threads is that they can sometimes hide I/O latency. In general, writing programs with good memory locality is one of the better levers for optimization.
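
For the locality point, the textbook example is traversal order over a 2-D array (the sizes here are arbitrary):

    #define ROWS 1024
    #define COLS 1024
    static double grid[ROWS][COLS];

    /* Row-major traversal: consecutive iterations touch adjacent
       memory, so each cache line fetched is fully used. */
    double sum_row_major(void) {
        double s = 0.0;
        for (int r = 0; r < ROWS; r++)
            for (int c = 0; c < COLS; c++)
                s += grid[r][c];
        return s;
    }

    /* Column-major traversal of the same data: every access jumps a
       whole row (8 KiB) ahead, so most of each cache line is wasted. */
    double sum_col_major(void) {
        double s = 0.0;
        for (int c = 0; c < COLS; c++)
            for (int r = 0; r < ROWS; r++)
                s += grid[r][c];
        return s;
    }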

DmitryOlshansky 2 days ago

Python is not something I expect to see when talking about performance. Being interpreted but with powerful builtins, it distorts what is fast and what isn't in a fairly unpredictable way.

bionhoward 3 days ago

The first thing I thought when I saw your code was “what about optimizing for readability?”

  • dayvigo 3 days ago

    APL programmers would say it's already maximally optimized for readability (actually, that it could be improved further with fewer and shorter keywords and stdlib function names). That's not even a joke or a dig at APL programmers; whether that code would be more or less readable with longer, more descriptive names is entirely a matter of perspective, preference, and context.

tialaramex 2 days ago

> it may well be fast enough, which is what really matters.

This deserves to be a headline in most optimisation discussions. Fast enough or small enough is often all that matters; start there.

genman 3 days ago

5. Use a higher level system that can offer better optimizations.

This can include a more advanced VM, or using other problem-oriented systems such as databases or solvers.