Microservices: An Approach to Scalable Software Development

by Gautham Pai

Why Startups Are Agile: The Growth Story

Imagine a startup in its early days. Suppose it’s a ride-sharing company, similar to Uber. The startup might have a grand vision of creating autonomous flying cars that pick people up from their rooftops and drop them off at their destinations. However, to turn this vision into reality, the company needs a lot of resources: money, people, and advanced technology.

So, do they abandon their grand vision? Not at all! They start with a smaller, manageable version of their idea. For example, instead of flying cars, they begin with regular cars driven by humans. They might not even have all the resources needed to fully develop this idea yet, so they focus on creating a basic version of their product that adds value to users. This basic version is called the Minimum Viable Product (MVP).

In the early stages, startups often have very few people. It’s common to see startups with just one or two founders, who handle everything themselves. For instance, one founder might focus on sales while the other handles all aspects of development—designing, coding, testing, and managing operations.

Here’s how it works:

  1. Developing the MVP: The engineer does everything from planning to coding to testing and operating the MVP. With today’s technology, this is becoming easier to manage.

  2. Launching the MVP: Once the basic product is ready, they release it to users. They then check if users like it and whether it’s gaining traction.

  3. Getting Feedback: If the MVP shows promise, they move forward with their plan. If not, they tweak the product based on user feedback and try again.

  4. Seeking Funding: Startups often approach Venture Capitalists (VCs) for funding. They present what they’ve built, show how users are reacting, and discuss their vision and needs. If VCs are impressed, they provide the funds needed to grow.

Why are startups so agile? Agility means the ability to quickly respond to customer needs. Startups with a small team can quickly adapt and make changes. For example, if a client asks for a new feature, a small, agile team can often add it quickly, impressing the client with their responsiveness.

But what happens as the startup grows? Once the startup gets funding, it starts hiring more people to speed up development. The team might grow from a few engineers to 50 or more, with specialized roles for client-side development, server-side development, quality assurance, and operations.

As the team grows, the startup’s agility might decrease. More people and more roles mean more communication and coordination are needed. This can slow down decision-making and response times, making the organization less agile compared to its early days when a few people could quickly make and implement decisions.

Characteristics of Monolithic Applications

In a monolithic application, all features and functions are combined into one large, unified system. Here’s how it typically works:

  1. Single Application: The entire product is built as one big application. This means that all functionalities and components are tightly integrated into a single codebase.

  2. Unified Development Team: A single, large team manages the entire software development lifecycle. This team handles everything from planning and coding to testing and deployment.

  3. Single Technology Stack: The whole application uses one technology stack. This includes a single set of programming languages, frameworks, and tools across the entire application.

  4. Centralized Server: The application runs on a single application server (or a few servers working together). This server handles all the application logic for the entire product.

  5. One Deployment Unit: The whole application is deployed as one unit. This means that any updates or changes to the application involve deploying the entire system at once.

In summary, a monolithic application is a cohesive, all-in-one system where everything is interconnected and managed together. This approach can simplify development but can also pose challenges as the application grows.

Let’s take a cab booking app as an example. In a monolithic system, all the features—such as booking a cab, processing payments, and managing users—would be part of a single, massive codebase.
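
To make this concrete, here is a minimal sketch of what such a monolith might look like, using Flask purely for illustration (the actual stack could be anything): every feature is a route in the same codebase, served by one process, backed by one database, and deployed as a single unit.

```python
# Illustrative sketch of a monolithic cab booking backend.
# Framework choice (Flask) and route names are assumptions for this example.
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/bookings", methods=["POST"])
def book_cab():
    # Booking, pricing, and driver assignment all live in this one codebase.
    return jsonify({"booking_id": 1, "status": "confirmed"})

@app.route("/payments", methods=["POST"])
def process_payment():
    # Payment logic shares the same process, database, and deploy cycle.
    return jsonify({"payment_id": 1, "status": "charged"})

@app.route("/users/<int:user_id>")
def get_user(user_id):
    # User management is just another module in the same application.
    return jsonify({"user_id": user_id, "name": "Asha"})

if __name__ == "__main__":
    app.run()  # One server process serves every feature of the product.
```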

Challenges in a Monolithic Application, or Why Agility Reduces Over Time

As a monolithic application grows, it often faces significant challenges that impact its agility. Initially, developing features might have seemed quick and straightforward. Perhaps an engineer could add a feature overnight. However, as the application expands, even small changes can become time-consuming. What once took hours now takes days or weeks. This slowdown is a common issue as monolithic applications evolve.

There’s a well-known principle from the book "The Mythical Man-Month" by Frederick P. Brooks, which highlights a common problem in software development:

Adding manpower to a late software project makes it later.

This means that simply adding more people to a project that is already behind schedule doesn’t necessarily speed up progress. In fact, it can often make things more complicated and delay the project further.

As a monolithic application evolves, several issues can arise:

Larger Codebase

As your organization and team grow, so does the size of the codebase. More people contribute code, leading to a more complex application. This results in:

  • Lack of Comprehensive Understanding: With a growing team working on a cab booking application, no single person may fully understand the entire codebase. For example, while one engineer might focus on the cab booking feature, another might handle payment processing, making it challenging for anyone to grasp the whole system.

Outdated Architecture

Initially, the architecture might have been designed for a simple feature, like booking a cab from point A to point B. As your team expands and you introduce new features, the initial architecture may no longer be suitable:

  • Evolving Requirements: Suppose you start with a basic cab booking system. As your business grows, you might introduce features such as hourly rentals, outstation travel, auto-rickshaw bookings, food delivery, and a wallet system. Each new feature requires adjustments to the existing architecture, which was originally designed for just one type of service.

  • Architectural Mismatch: If your architecture was designed with only cab bookings in mind (Plan A), it may struggle to support additional services like food delivery and wallet functionalities (Plan B). This is similar to building a foundation for a two-floor house and later trying to add eight more floors. While such a mismatch is unacceptable in civil engineering, it is a common issue in software development.

  • Increased Complexity: As you layer new features on top of the outdated architecture, your codebase becomes increasingly buggy and difficult to maintain. For instance, integrating food delivery and wallet functionalities into the original cab booking system might result in a complex and patchy codebase.

Ad-Hoc Technology Choices

In the early stages, technology choices are often made based on convenience rather than strategic planning:

  • Prototype Technology: For example, you might initially choose Python to build the MVP for cab bookings because your team is familiar with it. However, as your application grows, you might realize that Golang would be more suitable for handling additional features.

  • Difficult Transition: Transitioning from Python to Golang for a large codebase can be challenging. You now have a million lines of Python code supporting various features, and rewriting it in Golang is a daunting task, even if you believe Golang is a better choice for the future.

In summary, as your monolithic cab booking application grows, it encounters challenges related to an expanding codebase, outdated architecture, and technology choices made under different circumstances. Addressing these issues requires careful consideration of how to evolve the system effectively while managing the complexity of adding new features.

The Larger the Team, the More the Communication, the Greater the Confusion

As your team grows, communication becomes more complex, leading to potential confusion and delays. For instance, imagine you’re working on adding an auto-rickshaw booking feature to your cab booking application:

  • Communication Challenges: The client-side developer responsible for the auto-rickshaw booking interface needs a specific API from the server-side developer. However, the server-side developer can't implement this API change until the client-side developer first makes another change to the existing cab booking system. Unfortunately, the client-side developer is currently occupied with fixing a bug in the cab booking feature.

  • Resulting Delays: This dependency and the need for coordinated changes can lead to delays in launching the new auto-rickshaw booking feature. As more features and developers are involved, the complexity of communication and coordination increases, often resulting in slower progress and more confusion.

Technical Debt Increases Over Time

Technical debt refers to the compromises and shortcuts taken during development that might need to be addressed later. Experienced engineers often encounter these situations:

  • Shortcuts Taken: You might hear experienced developers say things like:

    • "This isn’t the best way to write this code, but I’ll handle it for now and come back to fix it later. I’ll add a TODO/FIXME comment to remind myself."
    • "That TODO/FIXME comment has been in our codebase for ages! We’re not sure how to fix it now since the original developer who wrote it has left."
    • "I would like to write more tests, but we don’t have time right now. I’ll add them later."
  • Impact of Technical Debt: Over time, these shortcuts accumulate and lead to increased technical debt. For instance, if the cab booking system was hastily built with quick fixes and without adequate testing, it may work initially but become increasingly unstable and harder to maintain as new features are added. This growing technical debt makes the architecture more fragile and harder to evolve.

How Evolving Technology Affects Agility

Imagine you started building your cab booking application in 2014 using the best technology available at that time. A few years later, a new technology called Kubernetes becomes popular. Since Kubernetes wasn’t yet an established option in 2014, you couldn’t rely on it when you first built your application. Similarly, there’s no guarantee that Kubernetes—or any technology—will be relevant 5-10 years from now.

Technology is always changing. The best tools and technologies of today might not be the best tomorrow. For example, you may have started with Java 8 and now want to upgrade to Java 15. Or you began with Angular 4 and now need Angular 12 for new features.

Do you need to keep up with new technology?

Sticking with outdated technology isn’t an option because older tools often stop receiving updates and support. This can make your application vulnerable to bugs and security issues. As technology evolves, you might need to rewrite parts of your application to stay current and secure.

Operational Challenges with Scaling and Failure Handling

When you have a monolithic application (where everything is built as one large system), scaling and managing failures become tricky:

  • Scaling Challenges: If your cab booking app suddenly gets a lot of users, you can’t just scale the cab booking feature. Instead, you have to scale the entire application, including features like auto-rickshaw booking, food delivery, and wallet management. This can waste resources and complicate things.

  • Failure Impact: If one part of the application fails, such as the food delivery feature, it can affect the whole system. For example, if the food delivery component has a problem, it might slow down or disrupt the cab booking service, leading to a poor experience for users.

How Issues in One Part Can Affect the Entire Application

Imagine your cab booking app also has a food ordering feature. If the food ordering part has a problem, like a memory leak, it can cause issues for the entire app. Even though food ordering isn’t your main focus—cab bookings are—both features are part of the same large application.

Here’s what happens:

  • Memory Leak Issue: A memory leak in the food ordering feature means that the app is using up more and more memory over time without releasing it. This problem can slow down or even crash the entire application.

  • Impact on Cab Bookings: Since everything is connected in one big system, the problem with the food ordering feature can lead to delays in cab bookings. Your customers might experience slower service or interruptions, which can negatively impact your business.

Ideally: If the food ordering feature has a problem, you’d want to be able to fix or turn off that part without affecting the cab booking service. But in a monolithic application, this is difficult to achieve because all parts are tightly linked together.
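
As a tiny, deliberately simplified sketch (the module and function names are invented), here is how a leak in one feature drags down everything else sharing the same process:

```python
# Both features run in one Python process, so a leak in the food ordering
# module eats memory that the cab booking code also depends on.
import tracemalloc

_leaked_orders = []  # the food ordering module never releases old orders

def handle_food_order(order):
    _leaked_orders.append(order * 10_000)  # keeps growing: a memory leak

def handle_cab_booking(rider, destination):
    # Perfectly healthy code, but it shares the process with the leaky
    # module, so it slows down or crashes along with everything else.
    return {"rider": rider, "destination": destination, "status": "booked"}

tracemalloc.start()
for i in range(1_000):
    handle_food_order(f"order-{i}")
current, _peak = tracemalloc.get_traced_memory()
print(f"Memory held by the whole app: {current / 1_000_000:.1f} MB")
```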

Deployments and Rollbacks Affect the Whole Application

In a monolithic application, deployments and rollbacks are done for the entire application, not just a part of it. This means:

  1. Deployment Challenges:

    • When you need to deploy a new feature or fix a bug, you must deploy the entire application. This can be complex and time-consuming, especially if the application is large and comprises many different components.
  2. Rollback Difficulties:

    • If something goes wrong after a deployment, rolling back to a previous version involves reverting the entire application to its previous state. This process can be risky and may introduce other issues, as the rollback affects all components of the application, not just the problematic part.

This all-or-nothing approach can lead to downtime or instability if the deployment or rollback does not go as planned. It’s a significant drawback because even a small change can impact the whole application, making it challenging to manage and deploy updates safely.

Security Issues in One Part Affect the Whole Application

In a monolithic application, a security issue in one part of the application can compromise the entire system. Here’s how:

  1. Widespread Impact:

    • Since all components are tightly integrated, a vulnerability in one part—like a bug in the food ordering module—can potentially expose the entire application to security risks. For example, if the food ordering module has a security flaw, an attacker might exploit it to access sensitive data or disrupt the entire application.
  2. Hard to Contain Threats:

    • Containing and fixing security issues is challenging because you have to address the vulnerability within the context of the entire application. Unlike microservices, where issues are more contained within individual services, a monolithic application requires comprehensive solutions to mitigate security threats across the whole system.
  3. Update and Patch Management:

    • Applying security updates or patches also affects the entire application. This means you have to ensure that all components of the application are compatible with the new updates, which can be complex and time-consuming.

Microservices: A Proposed Alternative to Monolithic Applications

Some important characteristics of microservices:

  • Application servers are separated.
  • Other application components, such as databases and caching servers, are also separated.
  • State is separated: each entity has a single authority over its data, a single source of truth.
  • Service-to-service communication happens over well-defined interfaces. One service has no business accessing the backend components of another service, e.g., its database.
  • The teams behind the microservices are separated and have autonomy over project planning. Planning, developing, testing, building, deploying, and managing each microservice happens independently of the others.
  • Teams have autonomy over technology choices: the programming languages, tools, libraries, and frameworks used.
  • Teams don't need each other's permission to make modifications or to change technologies.
  • Teams are much smaller, so they are more agile.
  • Codebases are much smaller, so technology evolution and rewrites are more manageable.
  • To handle cross-dependencies between APIs, APIs are versioned. As APIs evolve, older ones are deprecated before removal. There is a window of time during which APIs are marked as deprecated before they are removed; this gives dependent teams time to make the necessary changes to their applications to accommodate the evolution (a sketch of versioned endpoints follows this list).
  • End users don't see the application as a bunch of microservices; they see the entire thing as one application.
  • Sessions are managed with technologies like JWT (JSON Web Tokens).
  • The client side can also be split into smaller components using micro-frontend architectures.
  • Services can be further split into smaller services.
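
As a rough illustration of the API versioning point above, a service can keep serving the old contract with a deprecation notice while dependent teams migrate to the new one. The endpoint paths, payload shapes, and the sunset date below are assumptions made for this sketch, and Flask is used only as an example framework.

```python
# Sketch: two versions of the payment lookup API served side by side
# during the deprecation window. Paths and fields are illustrative.
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/v1/payments/<int:payment_id>")
def get_payment_v1(payment_id):
    # Old contract, still served while dependent teams migrate.
    response = jsonify({"id": payment_id, "amount": 250})
    response.headers["Deprecation"] = "true"                 # hypothetical policy
    response.headers["Sunset"] = "Wed, 31 Dec 2025 00:00:00 GMT"
    return response

@app.route("/v2/payments/<int:payment_id>")
def get_payment_v2(payment_id):
    # New contract: the amount is now split into value and currency.
    return jsonify({"id": payment_id,
                    "amount": {"value": 250, "currency": "INR"}})
```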

Understanding Microservices through the Cab Booking Example

As we have seen, a monolithic architecture becomes increasingly hard to manage and scale as the application grows. Enter microservices—an architecture designed to overcome these challenges by breaking down large applications into smaller, independent pieces.

Let’s take the cab booking app as an example. Adopting a microservices architecture would mean that we split the application into smaller services, each responsible for a specific business function, such as cab bookings, payments, or user management.

How Microservices Work

Think of each microservice as its own mini-startup within the app, managing its own piece of functionality independently. For example:

  • Cab Service handles booking cabs, managing drivers and riders.
  • Payment Service deals with transactions and payment processing.
  • User Service stores information like user IDs, names, and email addresses.

Each service has its own database, so the cab service, payment service, and user service all manage their data separately. This separation ensures that each service can scale, update, and operate independently without affecting the others.
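
As a small sketch of this separation (two local SQLite files stand in for two independently owned databases, and the schemas are invented for illustration), each service writes only to its own store:

```python
# Sketch: per-service data ownership. In production each service would own
# its own database server; two SQLite files stand in for that here.
import sqlite3

cab_db = sqlite3.connect("cab_service.db")          # owned by the cab team
payment_db = sqlite3.connect("payment_service.db")  # owned by the payments team

cab_db.execute("CREATE TABLE IF NOT EXISTS trips (id TEXT, rider TEXT, driver TEXT)")
payment_db.execute("CREATE TABLE IF NOT EXISTS payments (id TEXT, trip_id TEXT, amount REAL)")

# The cab service records the trip in its own database...
cab_db.execute("INSERT INTO trips VALUES (?, ?, ?)", ("trip-42", "asha", "ravi"))
cab_db.commit()

# ...and the payment service records the charge in its own. Neither service
# opens a connection to the other's store; they talk via APIs or events.
payment_db.execute("INSERT INTO payments VALUES (?, ?, ?)", ("pay-7", "trip-42", 250.0))
payment_db.commit()
```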

A Real-World Scenario

Let’s walk through a typical cab booking scenario:

  1. A rider books a cab.
  2. The driver picks up the rider and drops them at the destination.
  3. The rider then pays for the trip.

In this case, the cab service will store trip information, such as the driver and rider details, in its own database. The payment service, however, will store payment-related information in its database, while the user service will keep user account details in its own database. Each service is responsible for its own piece of the puzzle.

Independent Teams, Independent Code

In a microservices architecture, each service is developed and managed by its own team. These teams can choose their preferred programming languages, frameworks, and tools without worrying about how the other services are built. This flexibility allows for faster development and easier maintenance.

For example, if a client requests details about payments for their last 10 rides, the cab service can’t access the payment service’s database directly. Instead, it must request the data from a well-defined endpoint provided by the payment service. This keeps the services isolated and secure, while still allowing them to communicate efficiently.
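
A minimal sketch of that interaction is shown below; the internal URL, query parameters, and response fields are assumptions invented for this example, not a real API of any particular system.

```python
# Sketch: the cab service never queries the payment service's database.
# It calls a well-defined HTTP endpoint owned by the payment team instead.
import requests

PAYMENT_SERVICE_URL = "http://payment-service.internal/payments"  # hypothetical

def payments_for_last_rides(rider_id, limit=10):
    # Ask the payment service, the single authority for payment data.
    response = requests.get(
        PAYMENT_SERVICE_URL,
        params={"rider_id": rider_id, "limit": limit},
        timeout=2,  # calls between services can fail; always bound them
    )
    response.raise_for_status()
    return response.json()  # e.g. [{"trip_id": "trip-42", "amount": 250}, ...]
```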

Data Ownership and Duplication

Each service is the authority for its own data. For example, the payment service is responsible for payment data, and any information about payments must come from its database. You might create copies of this data in other services for reports or summaries, but the payment service remains the source of truth.

Syncing Data Across Services: Event-Driven Architecture

In microservices, keeping all the services in sync can be challenging. This is where event-driven architecture comes in. Imagine each service as a part of a larger system of events. When one event happens, like a payment being made, it triggers other actions, such as updating the trip status in the cab service.

This flow of events is like a chain reaction—one action leads to another, ensuring that all the services stay up to date with the latest changes. For example, when a payment is processed, an event is triggered, which the cab service listens to and updates its records accordingly. This cycle continues, keeping everything in sync.
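
Here is a deliberately simplified, in-memory sketch of that chain reaction. A real system would publish events through a message broker such as Kafka or RabbitMQ; the event name and payload fields below are invented for illustration.

```python
# In-memory stand-in for a message broker, showing the shape of
# event-driven synchronization between services.
from collections import defaultdict

subscribers = defaultdict(list)

def subscribe(event_name, handler):
    subscribers[event_name].append(handler)

def publish(event_name, payload):
    for handler in subscribers[event_name]:
        handler(payload)

# The cab service keeps its own copy of trip status up to date by
# listening for payment events instead of reading the payment database.
trips = {"trip-42": {"status": "completed", "paid": False}}

def on_payment_completed(event):
    trips[event["trip_id"]]["paid"] = True

subscribe("payment.completed", on_payment_completed)

# The payment service publishes an event once it has charged the rider.
publish("payment.completed", {"trip_id": "trip-42", "amount": 250})
print(trips["trip-42"])  # {'status': 'completed', 'paid': True}
```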

How "Micro" Should Microservices Be?

In a microservices architecture, deciding how to split your application into smaller services can be challenging. Let’s start with the basics: services like cab booking, payments, and user management are distinct and should be separate. But you can go even further and break these down into smaller, more specialized services.

For instance, consider these possible microservices:

  • Driver Location Service: This service’s sole job is to track and update the current location of drivers. It only interacts with drivers and continuously maps their GPS coordinates.
  • Cab Booking Service: This service handles the process of booking a cab. It manages all the details related to booking but doesn’t handle driver location or payments.

Finding the Right Level of Granularity

The key to designing effective microservices is to follow the principle of “Do one thing, but do it right.” This philosophy, borrowed from the Unix community, means that each microservice should focus on a specific piece of functionality and execute it well.

If a function can be clearly defined and managed independently, it’s a good candidate for a microservice. For example, the grep command, which searches for patterns in text, has been around for 46 years. It does one thing, but it does it exceptionally well. This makes it a reliable and enduring tool.

Designing Microservices

When designing microservices, aim to create components that:

  • Perform a single, well-defined task.
  • Can operate independently of other services.
  • Are robust enough to handle changes and updates with minimal intervention.

By focusing on these aspects, your microservices will be more resilient and easier to maintain over time. They’re less likely to require frequent updates or rewrites, as they are built to handle their specific function effectively.

Serverless Computing

In the world of microservices, sometimes the smallest unit of work is just a single function. This approach is known as Function as a Service (FaaS) and is a key part of serverless computing.

How Serverless Works

In a serverless architecture, when a request comes in, an API gateway acts like a traffic controller. It looks at the request path and triggers a specific piece of code to handle that request. This code runs only for the duration of the request and then disappears once it's done.

Think of it like a pod that appears to handle a specific request, does its job, and then vanishes. There’s no need for a persistent server running all the time—everything is created and destroyed on demand.
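
As an illustration, a function in this model is just a small handler that the platform invokes per request. The sketch below uses the AWS Lambda handler convention as one example; the event shape shown is simplified, and the response format is an assumption for an API gateway integration.

```python
# Sketch of a Function-as-a-Service handler. The platform creates an
# instance, calls this function for a request routed by the API gateway,
# and may tear the instance down afterwards.
import json

def handler(event, context):
    # 'event' carries the HTTP request the gateway forwarded to us;
    # the exact shape depends on the platform and is simplified here.
    body = json.loads(event.get("body") or "{}")
    rider_id = body.get("rider_id", "unknown")
    return {
        "statusCode": 200,
        "body": json.dumps({"rider_id": rider_id, "status": "cab booked"}),
    }
```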

Benefits of Serverless

  1. Scalability: If you receive a massive influx of requests, the serverless platform automatically scales up by creating as many instances (or pods) as needed, within the platform's concurrency limits. For example, a sudden spike of a million requests would be spread across a very large number of short-lived instances, each handling requests independently. Once an instance finishes its work, it shuts down.

  2. Cost Efficiency: You only pay for the time your code is running. If a pod is created to handle a request and then shuts down, you’re billed only for that short period, rather than for a constantly running server.

Modern Serverless Capabilities

While earlier serverless functions were designed to run briefly, modern serverless environments can handle tasks that take minutes. This flexibility allows you to write and deploy small, independent pieces of functionality without worrying about the underlying infrastructure.

Advantages Over Monolithic Applications

Serverless computing offers a major advantage over traditional monolithic applications:

  • Flexibility: It’s easier to update and manage individual functions or services compared to a massive monolithic codebase.
  • Adaptability: You can quickly change frameworks, languages, or other aspects of your functions without affecting the entire application.

Serverless computing is a powerful alternative to monolithic development, offering scalability, cost efficiency, and flexibility that can greatly improve how you build and manage applications.

Challenges in the Microservices World

Switching from a monolithic application to a microservices architecture introduces a new set of challenges. The tools and practices that worked for a monolithic system often don’t translate directly to a microservices setup.

The Complexity of Microservices

In a monolithic application, everything is tightly integrated into a single codebase. But in a microservices architecture, the application is divided into multiple independent services. Each service can interact with others and handle different parts of the application. As the number of these independent services increases, the complexity of managing them grows.

Key Challenges in Microservices

  1. Increased Failure Points:

    • More Moving Parts: With multiple services communicating with each other, there are many more points where things can go wrong. If one service fails or behaves unexpectedly, it can impact the entire system.
    • Automation is Essential: To manage this complexity and reduce the risk of failure, automation becomes crucial. Automated tools and processes help in deploying, testing, and scaling services efficiently.
  2. Telemetry Challenges:

    • Telemetry refers to the process of collecting and analyzing data from your application to understand its behavior and performance. In a microservices environment, effective telemetry involves three key areas:

    • Logging: Keeping track of events and errors that occur within each microservice. With many services running independently, gathering and aggregating logs from each service is essential for troubleshooting and understanding the overall system behavior.

    • Tracing: Following the path of a request as it travels through various microservices. Tracing helps identify how requests are processed, where delays or issues occur, and how different services interact with each other.

    • Monitoring: Continuously observing the health and performance of each microservice. Monitoring involves tracking metrics such as response times, resource usage, and error rates to ensure that all services are operating as expected.

Addressing these challenges requires specialized tools and strategies:

  • Automated Deployment and Scaling: Implement continuous integration and deployment (CI/CD) pipelines to automate testing and deployment processes. Use container orchestration tools like Kubernetes to manage automatic scaling and resource allocation.

  • Centralized Logging and Monitoring Tools: Use tools like the ELK Stack (Elasticsearch, Logstash, Kibana) for log aggregation and visualization, Prometheus for collecting metrics, and Grafana for dashboards. These tools help aggregate data from different services and provide insights into system performance.

  • Distributed Tracing Solutions: Implement distributed tracing tools like Jaeger or Zipkin to track requests across services and visualize how they flow through the system.
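
As a small illustration of the tracing idea, each service can attach the same trace ID to its log lines and forward it on outgoing calls, so a request can be followed across services in the centralized logging system. The header name, log format, and internal URL below are conventions chosen for this sketch; production systems typically rely on the W3C Trace Context standard or OpenTelemetry SDKs rather than hand-rolled headers.

```python
# Sketch: propagate one trace ID across services so their logs can be
# correlated later. Header name, fields, and URL are illustrative choices.
import json
import logging
import time
import uuid

import requests

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("cab-service")

def handle_booking(request_headers):
    # Reuse the caller's trace ID if present, otherwise start a new trace.
    trace_id = request_headers.get("X-Trace-Id", str(uuid.uuid4()))

    log.info(json.dumps({"trace_id": trace_id, "event": "booking.started",
                         "ts": time.time()}))

    # Forward the same trace ID so the payment service's log lines can be
    # joined with ours when searching the centralized logging system.
    requests.post("http://payment-service.internal/charge",   # hypothetical
                  json={"trip_id": "trip-42"},
                  headers={"X-Trace-Id": trace_id},
                  timeout=2)

    log.info(json.dumps({"trace_id": trace_id, "event": "booking.completed",
                         "ts": time.time()}))
```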

By addressing these challenges with the right tools and practices, you can effectively manage a microservices architecture and ensure that your application remains reliable and performant.
