🥐 Hey there! It’s Platform Weekly, the John Wick of cloud native newsletters. Each sequel issue is better than the last. Let's get bakin'.

Also, want to know how your organization stacks up against platform engineering best practices? Take this 12-question Platform Maturity Survey to find out. (And, no, you don’t even have to share your email to get the results.)

Want more Platform Engineering and DevOps news, analysis, and resources? Consider signing up for a newsletter from our friends over at The New Stack. They have daily and weekly newsletter options to keep devs, engineers, and the cloud native community in the know.

5 benefits of monitoring and managing your Internal Developer Platform

Colleen Green, Tanzu Labs at VMware

This newsletter was adapted from this blog post.

Gardening isn't just for plants; it's for internal developer platforms too! Monitoring and managing your internal developer platform is essential to the application development and delivery process. Putting in the effort to maintain and optimize your platform, and keeping an eye on potentially disruptive changes, will keep your garden safe and pests out.

There are numerous benefits to building a monitoring and management system for your platform. These include:

1️⃣ Improved system performance

By monitoring for bottlenecks, spotting trends, and identifying potential issues well before they impact performance, engineering teams can implement proactive strategies to ensure optimum availability, reliability, scalability, and response times. Regular maintenance, such as patching and upgrading hardware and applications, also improves efficiency and the user experience. All of this puts organizations in a better position to reduce system disruptions, anticipate future needs, and, ultimately, achieve greater efficiency and profitability.

2️⃣ Cost reduction

Optimizing and managing an application platform properly can offer organizations various ways to cut costs. Keeping an eye on the performance and use of the platform can reveal areas for optimization, leading to better resource utilization and lower hardware and license fees. Proactively keeping tabs on the application’s health can catch issues early, avoiding the costly consequences of service outages and downtime. Moreover, thorough monitoring and oversight allow IT teams to see where resources can be redistributed to make a budget-friendly difference.

3️⃣ Scalability

Through monitoring, platform engineers can achieve scalability more efficiently by using data to replicate or adjust resources according to performance and capacity demands. This helps streamline the manual scaling and compliance management processes, and it provides in-depth insight into application performance, resource utilization, and user experience across multiple servers and services.

4️⃣ Enhanced security

Security is a primary concern in platform engineering. By providing guardrails for developers through golden paths and tracking application activity, engineers can detect malicious or unauthorized behavior, making it easier to respond quickly and mitigate potential damage. Monitoring can also provide insights into what security measures may need to be implemented or improved.

5️⃣ Improved feedback loops

Developers can keep their applications running at their best and continually monitor user engagement by leveraging feedback loops that collect usage data on performance in production. With this centralized control in hand, engineers can make modifications to their apps with speed and efficiency, and then see the results instantly. Automated alerting and notifications will also help engineers quickly detect and repair any issues that may arise.

Here are some quick ways to start monitoring and managing your application platform for more optimal results:

🔐 Secure your deployment environment

Start by securing the server and network configuration of the application platform and make sure the system is updated with the latest security patches. You can also implement security practices such as controlling access to the platform, monitoring for suspicious activity, and staying up to date on threats and vulnerabilities.

📈 Establish system baselines

After assessing your platform’s system components, establish baseline measurements of hardware and software configurations, as well as usage statistics. Next, define a baseline of acceptable performance metrics, which could include average uptime, system response times, network throughput, or any other relevant metrics. Set up a system to meet the defined baseline performance by adjusting system configurations, fine-tuning hardware, adding or removing resources, or optimizing code.
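To make the baseline idea a little more concrete, here is a minimal sketch in Python. It assumes you already have a list of historical response times in milliseconds; the function names, the nearest-rank percentile, and the 20% tolerance are illustrative choices, not something prescribed by any particular monitoring tool.

```python
import math
from statistics import mean


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def build_baseline(samples_ms):
    """Summarize historical response times (hypothetical input) into a baseline."""
    return {
        "mean_ms": mean(samples_ms),
        "p95_ms": percentile(samples_ms, 95),
    }


def within_baseline(current_p95_ms, baseline, tolerance=0.20):
    """True if the current p95 is no more than `tolerance` above the baseline p95."""
    return current_p95_ms <= baseline["p95_ms"] * (1 + tolerance)


# Example with made-up data: response times of 1..100 ms.
history = list(range(1, 101))
baseline = build_baseline(history)
print(baseline)
print(within_baseline(110, baseline))
```

In practice, the samples would come from whatever metrics store your platform already uses, and the tolerance would be tuned to how noisy the metric is.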

Establish alerting rules

Identify the events that you want to trigger an alert, and determine the best way to receive them. Use the platform-specific tools available in your software to create the alerting rules. Depending on the platform, you may need to define the conditions for when the alert will be triggered (such as a threshold for a certain metric, or an event name or type). Monitor these new rules for accuracy, and adjust as needed.
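As a rough illustration of what a threshold-based alerting rule boils down to, here is a small Python sketch. The `AlertRule` class, its fields, and the "consecutive breaches" idea are assumptions made for this example; real platforms have their own rule formats, so treat this as a way to reason about conditions rather than a drop-in config.

```python
import operator
from dataclasses import dataclass

# Supported comparisons for this sketch.
COMPARATORS = {">": operator.gt, "<": operator.lt}


@dataclass(frozen=True)
class AlertRule:
    """Hypothetical rule: fire when a metric breaches a threshold several times in a row."""
    name: str
    metric: str
    comparison: str       # ">" or "<"
    threshold: float
    consecutive: int = 1  # breaches in a row required before firing


def should_fire(rule, recent_values):
    """Return True if the last `rule.consecutive` values all breach the threshold."""
    if len(recent_values) < rule.consecutive:
        return False
    compare = COMPARATORS[rule.comparison]
    window = recent_values[-rule.consecutive:]
    return all(compare(value, rule.threshold) for value in window)


# Example: alert when the error rate stays above 5% for three readings.
rule = AlertRule("high-error-rate", "error_rate", ">", 0.05, consecutive=3)
print(should_fire(rule, [0.01, 0.06, 0.07, 0.09]))
```

Requiring several consecutive breaches is one simple way to cut down on noisy alerts, which helps with the "monitor these new rules for accuracy" step.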

🔍 Monitor application performance

Within the application architecture, identify the components, how they are connected, and their overall flow. Monitor the application layer and capture performance metrics, such as request latency, errors, and response times. Configure the monitoring system to alert you when certain thresholds are breached.
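Capturing latency and error counts at the application layer can be sketched with a simple decorator. This is a toy, in-memory example in Python: the `monitored` decorator, the `metrics` dictionary, and the `handle_checkout` handler are all hypothetical names, and a real setup would export these numbers to your monitoring system instead of holding them in a dict.

```python
import time
from collections import defaultdict
from functools import wraps

# In-memory store for this sketch; keyed by component name.
metrics = defaultdict(lambda: {"calls": 0, "errors": 0, "total_seconds": 0.0})


def monitored(name):
    """Record call count, error count, and cumulative latency for a function."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            except Exception:
                metrics[name]["errors"] += 1
                raise  # re-raise so callers still see the failure
            finally:
                metrics[name]["calls"] += 1
                metrics[name]["total_seconds"] += time.perf_counter() - start
        return wrapper
    return decorator


@monitored("checkout")
def handle_checkout(order_total):
    """Hypothetical request handler used only to exercise the decorator."""
    if order_total < 0:
        raise ValueError("order total cannot be negative")
    return {"status": "ok", "total": order_total}


handle_checkout(10)
try:
    handle_checkout(-1)
except ValueError:
    pass
print(dict(metrics["checkout"]))
```

From counters like these you can derive an error rate and average latency per component, which are the kinds of values a threshold alert would watch.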

⚙️ Automate processes

Take time to explore monitoring and management automation solutions: cloud-based services, open source tools, and enterprise-level products are all viable options. After finding the best tool and technology for your needs, begin setting up your monitoring and management system by configuring the monitoring agents, collecting data, storing logs, and establishing alerts. Once you have a basic system in place, review the performance of your automation periodically to ensure accurate data and efficient operation.

Learn more by watching Michael Coté and Bryan Ross’ PlatformCon 2023 sessions.


Short on time? ⏳ We got you 🥐😋

🥐 Google released a paper on how they define, measure, and manage technical debt. But if you don’t want to read the whole thing, check out Abi Noda’s 🔥 recap.

🥐 Did you attend PlatformCon 2023? If so, let us know what you thought about the conference. Your feedback helps us make our community events better. 💪

🥐 Realtor.com’s Suzy Julius shared how the U.S. real estate site transformed their culture into one that embraces platform engineering. Check out her PlatformCon talk or this blog recap.

And that’s a wrap on this week! As always, this newsletter is a community project. So if you have anything awesome to share from the cloud-native world, send it our way.

Stay crunchy 🥐

Luca