Martin Kleppmann

Designing Data-Intensive Applications

  • esandrew has quoted 4 years ago
    However, the downside of approach 2 is that posting a tweet now requires a lot of extra work. On average, a tweet is delivered to about 75 followers, so 4.6k tweets per second become 345k writes per second to the home timeline caches. But this average hides the fact that the number of followers per user varies wildly, and some users have over 30 million followers. This means that a single tweet may result in over 30 million writes to home timelines! Doing this in a timely manner—Twitter tries to deliver tweets to followers within five seconds—is a significant challenge. In the example of Twitter, the distribution of followers per user (maybe weighted by how often those users tweet) is a key load parameter for discussing scalability, since it determines the fan-out load. Your application may have very different characteristics, but you can apply similar principles to reasoning about its load.
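    The fan-out arithmetic in this quote can be checked with a minimal sketch. The figures (4.6k tweets per second, about 75 followers on average, over 30 million followers for some users, a five-second delivery target) come from the quote; the function and variable names are hypothetical and chosen only for illustration.

```python
# Figures taken from the quoted passage; names are illustrative assumptions.
TWEETS_PER_SECOND = 4_600          # average posting rate
AVG_FOLLOWERS = 75                 # average followers per tweet author
CELEBRITY_FOLLOWERS = 30_000_000   # lower bound for the largest accounts
DELIVERY_TARGET_SECONDS = 5        # delivery goal mentioned in the quote


def fanout_writes_per_second(tweets_per_second: int, followers: int) -> int:
    """Each tweet causes one home-timeline write per follower."""
    return tweets_per_second * followers


# Average case: total timeline-cache writes per second across all tweets.
avg_writes = fanout_writes_per_second(TWEETS_PER_SECOND, AVG_FOLLOWERS)

# Worst case: a single tweet from a 30-million-follower account must be
# written to every follower's timeline within the delivery target.
required_rate = CELEBRITY_FOLLOWERS / DELIVERY_TARGET_SECONDS

print(f"average fan-out: {avg_writes:,} writes/s")
print(f"one celebrity tweet: {required_rate:,.0f} writes/s for {DELIVERY_TARGET_SECONDS} s")
```

    Under these assumptions, one tweet from a very large account needs a sustained write rate many times higher than the average load of the whole system, which is why the quote treats the follower distribution, and not just the average, as the key load parameter.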
  • Hyeonsoo Shin has quoted 4 days ago
    Operability
    Make it easy for operations teams to keep the system running smoothly.
    Simplicity
    Make it easy for new engineers to understand the system, by removing as much complexity as possible from the system. (Note this is not the same as simplicity of the user interface.)
    Evolvability
    Make it easy for engineers to make changes to the system in the future, adapting it for unanticipated use cases as requirements change. Also known as extensibility, modifiability, or plasticity.
  • Hyeonsoo Shin has quoted 17 days ago
    A fault is usually defined as one component of the system deviating from its spec, whereas a failure is when the system as a whole stops providing the required service to the user
  • Hyeonsoo Shin has quoted 21 days ago
    We call an application data-intensive if data is its primary challenge—the quantity of data, the complexity of data, or the speed at which it is changing—as opposed to compute-intensive, where CPU cycles are the bottleneck
  • Samson Mwathi has quoted last year
    Many applications today are data-intensive, as opposed to compute-intensive
  • b9449300348 has quoted 2 years ago
    CPU clock speeds are barely increasing, but multi-core processors are stand
  • Peter Gazaryan has quoted 2 years ago
    A data-intensive application is typically built from standard building blocks that provide commonly needed functionality. For example, many applications need to:

    Store data so that they, or another application, can find it again later (databases)

    Remember the result of an expensive operation, to speed up reads (caches)

    Allow users to search data by keyword or filter it in various ways (search indexes)

    Send a message to another process, to be handled asynchronously (stream processing)

    Periodically crunch a large amount of accumulated data (batch processing)
  • exordiumexordium has quoted 4 years ago
    The currently trendy style of application development involves breaking down functionality into a set of services that communicate via synchronous network requests such as REST APIs (see “Dataflow Through Services: REST and RPC”). The advantage of such a service-oriented architecture over a single monolithic application is primarily organizational scalability through loose coupling: different teams can work on different services, which reduces coordination effort between teams (as long as the services can be deployed and updated independently).
  • esandrew has quoted 4 years ago
    There is no quick solution to the problem of systematic faults in software. Lots of small things can help: carefully thinking about assumptions and interactions in the system; thorough testing; process isolation; allowing processes to crash and restart; measuring, monitoring, and analyzing system behavior in production.
  • esandrew has quoted 4 years ago
    Sometimes, when discussing scalable data systems, people make comments along the lines of, “You’re not Google or Amazon. Stop worrying about scale and just use a relational database.” There is truth in that statement: building for scale that you don’t need is wasted effort and may lock you into an inflexible design. In effect, it is a form of premature optimization. However, it’s also important to choose the right tool for the job, and different technologies each have their own strengths and weaknesses.