Why Bare Metal Monitoring is Critical for Your Dedicated Server
Dedicated servers from Valebyte offer unparalleled performance, control, and security. However, with great power comes the responsibility of meticulous management. Monitoring your bare metal server with tools like Prometheus and Grafana is not just an option; it's a necessity for:
- Proactive Issue Resolution: Identify potential problems like high CPU usage, low disk space, or network bottlenecks before they impact your services.
- Performance Optimization: Understand resource utilization patterns to fine-tune your applications, databases, or game servers for maximum efficiency.
- Uptime Assurance: Receive alerts for critical events, ensuring you can respond quickly to maintain service availability for your users, clients, or players.
- Capacity Planning: Gather historical data to make informed decisions about scaling your infrastructure as your needs grow.
- Security Auditing: Monitor unusual network activity or resource spikes that could indicate a security incident.
Prometheus is an open-source monitoring system with a dimensional data model, flexible query language (PromQL), efficient time-series database, and a modern alerting approach. Grafana, on the other hand, is a leading open-source platform for monitoring and observability, allowing you to query, visualize, alert on, and explore your metrics no matter where they are stored.
Prerequisites and Server Requirements
Before we dive into the installation, ensure your Valebyte dedicated server meets the following requirements:
Operating System
- A fresh installation of a Linux distribution. This guide will provide commands for both Ubuntu/Debian and CentOS/RHEL.
Hardware Resources
Prometheus and Grafana are relatively lightweight for basic monitoring, but resource consumption scales with the number of metrics collected and the retention period. For a typical dedicated server monitoring setup:
- CPU: 2+ Cores (modern multi-core CPUs are efficient).
- RAM: 2GB+ (Prometheus uses RAM for its in-memory index; Grafana also needs some).
- Storage: 50GB+ SSD (Prometheus stores time-series data; SSDs are highly recommended for performance).
Network Access
Ensure the following ports are open in your server's firewall (and potentially your network security groups):
- 9090/TCP: Prometheus web UI and API.
- 3000/TCP: Grafana web UI.
- 9100/TCP: Node Exporter (for host metrics).
- 22/TCP: SSH (for remote access, which you're likely using).
User Privileges
- Sudo access or root privileges for installing packages and managing services.
Step-by-Step Installation Guide
Let's begin setting up your monitoring stack.
Step 1: Prepare Your Bare Metal Server
First, update your system's package list and installed packages. This ensures you have the latest security patches and dependencies.
For Ubuntu/Debian:
sudo apt update
sudo apt upgrade -y
sudo apt install -y wget curl apt-transport-https software-properties-common
For CentOS/RHEL:
sudo yum update -y
sudo yum install -y wget curl
Next, create dedicated system users for Prometheus and Node Exporter. This is a security best practice, limiting the privileges of these services.
sudo useradd --no-create-home --shell /bin/false prometheus
sudo useradd --no-create-home --shell /bin/false node_exporter
Step 2: Install Prometheus Server
Prometheus is distributed as pre-compiled binaries. We'll download the latest stable version, extract it, and set up a systemd service.
1. Download Prometheus: Find the latest stable version on the Prometheus download page. Replace the version number if a newer one is available.
cd /tmp
wget https://github.com/prometheus/prometheus/releases/download/v2.48.1/prometheus-2.48.1.linux-amd64.tar.gz
tar xvfz prometheus-2.48.1.linux-amd64.tar.gz
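Before extracting a downloaded archive, it can be worth confirming that it arrived intact. The sketch below assumes you have also downloaded a checksum file (for example sha256sums.txt) from the same release page; check the page for the exact file name. The helper name verify_download is hypothetical and not part of Prometheus.

```shell
# verify_download TARBALL CHECKSUM_FILE
# Succeeds only if TARBALL's SHA-256 matches its entry in CHECKSUM_FILE.
# (verify_download is a hypothetical helper name used for illustration.)
verify_download() {
  tarball="$1"
  sums="$2"
  name=$(basename "$tarball")
  # Checksum lines are expected to look like "<hash>  <file name>".
  expected=$(awk -v f="$name" '$2 == f { print $1 }' "$sums")
  [ -n "$expected" ] || { echo "no checksum entry for $name" >&2; return 1; }
  actual=$(sha256sum "$tarball" | awk '{ print $1 }')
  if [ "$expected" = "$actual" ]; then
    echo "OK: $name"
  else
    echo "MISMATCH: $name" >&2
    return 1
  fi
}

# Example (assumes the checksum file was saved next to the tarball):
# verify_download /tmp/prometheus-2.48.1.linux-amd64.tar.gz /tmp/sha256sums.txt
```

If the function reports a mismatch, delete the archive and download it again rather than extracting it.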
2. Move Binaries and Set Permissions:
sudo mkdir -p /etc/prometheus
sudo mkdir -p /var/lib/prometheus
sudo cp prometheus-2.48.1.linux-amd64/prometheus /usr/local/bin/
sudo cp prometheus-2.48.1.linux-amd64/promtool /usr/local/bin/
sudo chown prometheus:prometheus /usr/local/bin/prometheus
sudo chown prometheus:prometheus /usr/local/bin/promtool
sudo cp -r prometheus-2.48.1.linux-amd64/consoles /etc/prometheus
sudo cp -r prometheus-2.48.1.linux-amd64/console_libraries /etc/prometheus
sudo chown -R prometheus:prometheus /etc/prometheus/consoles
sudo chown -R prometheus:prometheus /etc/prometheus/console_libraries
sudo chown prometheus:prometheus /var/lib/prometheus
3. Create Prometheus Configuration File:
Create /etc/prometheus/prometheus.yml with the following basic configuration. This configures Prometheus to monitor itself.
sudo nano /etc/prometheus/prometheus.yml
global:
  scrape_interval: 15s # How frequently to scrape targets
  evaluation_interval: 15s # How frequently to evaluate rules

alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090'] # Prometheus server itself
Set appropriate permissions for the configuration file:
sudo chown prometheus:prometheus /etc/prometheus/prometheus.yml
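YAML is sensitive to indentation, so it helps to validate the file before starting the service. A minimal sketch using promtool, which was copied to /usr/local/bin in the previous step; the wrapper name validate_prometheus_config is hypothetical and only adds a check that promtool is actually on your PATH.

```shell
# validate_prometheus_config [CONFIG_FILE]
# Runs "promtool check config" on the file, or returns 127 if promtool
# cannot be found. (Hypothetical wrapper name, for illustration only.)
validate_prometheus_config() {
  config="${1:-/etc/prometheus/prometheus.yml}"
  if ! command -v promtool >/dev/null 2>&1; then
    echo "promtool not found in PATH" >&2
    return 127
  fi
  promtool check config "$config"
}

# Example:
# validate_prometheus_config /etc/prometheus/prometheus.yml
```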
4. Create a Systemd Service File for Prometheus:
Create /etc/systemd/system/prometheus.service:
sudo nano /etc/systemd/system/prometheus.service
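[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
    --config.file=/etc/prometheus/prometheus.yml \
    --storage.tsdb.path=/var/lib/prometheus \
    --web.console.templates=/etc/prometheus/consoles \
    --web.console.libraries=/etc/prometheus/console_libraries \
    --web.listen-address=:9090
ExecReload=/bin/kill -HUP $MAINPID

[Install]
WantedBy=multi-user.target

The ExecReload line is what allows sudo systemctl reload prometheus to work: it sends SIGHUP to Prometheus, which re-reads its configuration without a restart. Without it, systemd rejects the reload request for this unit.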
5. Start and Enable Prometheus:
sudo systemctl daemon-reload
sudo systemctl start prometheus
sudo systemctl enable prometheus
sudo systemctl status prometheus
Verify that Prometheus is running by checking its status. You should see active (running). You can also access the Prometheus UI in your web browser at http://YOUR_SERVER_IP:9090.
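If you prefer to check from the command line, Prometheus exposes a health endpoint at /-/healthy. The sketch below polls a URL until it answers successfully; wait_until_healthy is a hypothetical helper name, and the attempt count and delay are arbitrary defaults you can change.

```shell
# wait_until_healthy URL [ATTEMPTS] [DELAY_SECONDS]
# Polls URL with curl until it returns a successful HTTP status.
# (Hypothetical helper name; defaults are arbitrary.)
wait_until_healthy() {
  url="$1"
  attempts="${2:-10}"
  delay="${3:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f: fail on HTTP errors, -s: no progress output.
    if curl -fs "$url" >/dev/null 2>&1; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not healthy after $attempts attempt(s)" >&2
  return 1
}

# Example:
# wait_until_healthy http://localhost:9090/-/healthy
```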
Step 3: Install Node Exporter (for Host Metrics)
Node Exporter exposes a wide range of hardware and OS metrics (CPU, memory, disk I/O, network statistics) from your bare metal server. This is crucial for understanding your server's health.
1. Download Node Exporter: Find the latest stable version on the Prometheus download page.
cd /tmp
wget https://github.com/prometheus/node_exporter/releases/download/v1.7.0/node_exporter-1.7.0.linux-amd64.tar.gz
tar xvfz node_exporter-1.7.0.linux-amd64.tar.gz
2. Move Binary and Set Permissions:
sudo cp node_exporter-1.7.0.linux-amd64/node_exporter /usr/local/bin/
sudo chown node_exporter:node_exporter /usr/local/bin/node_exporter
3. Create a Systemd Service File for Node Exporter:
Create /etc/systemd/system/node_exporter.service:
sudo nano /etc/systemd/system/node_exporter.service
[Unit]
Description=Node Exporter
Wants=network-online.target
After=network-online.target

[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter --web.listen-address=:9100

[Install]
WantedBy=multi-user.target
4. Start and Enable Node Exporter:
sudo systemctl daemon-reload
sudo systemctl start node_exporter
sudo systemctl enable node_exporter
sudo systemctl status node_exporter
You can verify Node Exporter is working by visiting http://YOUR_SERVER_IP:9100/metrics in your browser. You should see a large amount of plain text metrics.
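The same check works from the shell, which is handy on a server without a browser. A minimal sketch that pulls one value out of the plain text output; metric_value is a hypothetical helper name, and it only handles metrics without labels, such as node_load1.

```shell
# metric_value NAME
# Reads Prometheus text-format metrics on stdin and prints the value of the
# first sample whose name is exactly NAME. Samples with labels (name{...})
# are not matched. (Hypothetical helper name, for illustration only.)
metric_value() {
  awk -v name="$1" '$1 == name { print $2; exit }'
}

# Example:
# curl -s http://localhost:9100/metrics | metric_value node_load1
```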
5. Add Node Exporter to Prometheus Configuration:
Edit /etc/prometheus/prometheus.yml to add Node Exporter as a target:
sudo nano /etc/prometheus/prometheus.yml
Add the following scrape_configs entry:
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'node_exporter'
    static_configs:
      - targets: ['localhost:9100'] # Node Exporter running on the same server
6. Reload Prometheus Configuration:
sudo systemctl reload prometheus
Go to the Prometheus UI (http://YOUR_SERVER_IP:9090), navigate to 'Status' -> 'Targets'. You should now see both 'prometheus' and 'node_exporter' jobs listed with a 'UP' state.
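The target list is also available through the Prometheus HTTP API at /api/v1/targets. The sketch below counts targets in a given health state without needing a JSON tool; count_targets is a hypothetical helper name, and it assumes the API returns compact JSON in which the field appears as "health":"up" with no spaces.

```shell
# count_targets STATE
# Reads the JSON from /api/v1/targets on stdin and prints how many targets
# report the given health state (up, down, unknown).
# Assumes compact JSON, i.e. "health":"up" without spaces.
# (Hypothetical helper name, for illustration only.)
count_targets() {
  state="$1"
  grep -o "\"health\":\"$state\"" | awk 'END { print NR }'
}

# Example:
# curl -s http://localhost:9090/api/v1/targets | count_targets up
```

With the configuration from this guide you would expect the count for up to be 2.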
Step 4: Install Grafana
Grafana will fetch data from Prometheus and present it in beautiful, customizable dashboards.
For Ubuntu/Debian:
sudo wget -q -O /usr/share/keyrings/grafana.key https://apt.grafana.com/gpg.key
echo "deb [signed-by=/usr/share/keyrings/grafana.key] https://apt.grafana.com stable main" | sudo tee /etc/apt/sources.list.d/grafana.list
sudo apt update
sudo apt install grafana -y
For CentOS/RHEL:
sudo nano /etc/yum.repos.d/grafana.repo
[grafana]
name=grafana
baseurl=https://rpm.grafana.com
repo_gpgcheck=1
enabled=1
gpgcheck=1
gpgkey=https://rpm.grafana.com/gpg.key
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
sudo yum install grafana -y
Start and Enable Grafana:
sudo systemctl daemon-reload
sudo systemctl start grafana-server
sudo systemctl enable grafana-server
sudo systemctl status grafana-server
Grafana should now be running. Access its web UI at http://YOUR_SERVER_IP:3000. The default login is admin / admin. You will be prompted to change the password on your first login – DO THIS IMMEDIATELY!
Step 5: Configure Grafana to Visualize Prometheus Data
1. Add Prometheus as a Data Source:
- Log in to Grafana (http://YOUR_SERVER_IP:3000).
- Click the gear icon (Configuration) on the left sidebar, then select 'Data sources'.
- Click 'Add data source'.
- Choose 'Prometheus' from the list.
- In the 'HTTP' section, set the 'URL' to http://localhost:9090 (since Prometheus is on the same server).
- Scroll down and click 'Save & Test'. You should see a message confirming 'Data source is working'.
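As an alternative to clicking through the UI, Grafana can load data sources from provisioning files at startup. The sketch below writes such a file; write_prometheus_datasource is a hypothetical helper name, the file name prometheus.yaml is arbitrary, and the directory in the usage example is the usual location for package installs (verify it on your system).

```shell
# write_prometheus_datasource DIR
# Writes a Grafana data source provisioning file into DIR.
# (Hypothetical helper name; the file name is arbitrary.)
write_prometheus_datasource() {
  dir="$1"
  mkdir -p "$dir"
  cat > "$dir/prometheus.yaml" <<'EOF'
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://localhost:9090
    isDefault: true
EOF
}

# Example (run as root, then restart Grafana so it reads the file):
# write_prometheus_datasource /etc/grafana/provisioning/datasources
# systemctl restart grafana-server
```

Keeping the data source in a file makes the setup repeatable if you rebuild the server.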
2. Import a Pre-built Dashboard for Node Exporter:
Grafana offers a vast library of community-contributed dashboards. A popular one for Node Exporter is ID 1860.
- Click the '+' icon on the left sidebar, then select 'Import'.
- In the 'Import via grafana.com' field, enter 1860 and click 'Load'.
- On the next screen, select your Prometheus data source from the dropdown.
- Click 'Import'.
You should now see a comprehensive dashboard displaying metrics from your bare metal server, covering CPU, memory, disk I/O, network traffic, and more. This provides immediate insights into your server's health, crucial for managing everything from game servers to complex web applications.
Step 6: Secure Your Monitoring Stack
Exposing monitoring tools to the internet without proper security is a significant risk. Implement these steps for production environments.
1. Configure Your Firewall:
Only open necessary ports and restrict access to trusted IP addresses where possible.
For UFW (Ubuntu/Debian):
sudo ufw allow ssh
sudo ufw allow 9090/tcp # Prometheus (consider restricting to specific IPs)
sudo ufw allow 3000/tcp # Grafana (consider restricting to specific IPs)
sudo ufw allow 9100/tcp # Node Exporter (only needed if Prometheus scrapes from another host; restrict to that IP)
sudo ufw enable
For Firewalld (CentOS/RHEL):
sudo firewall-cmd --add-service=ssh --permanent
sudo firewall-cmd --add-port=9090/tcp --permanent # Prometheus
sudo firewall-cmd --add-port=3000/tcp --permanent # Grafana
sudo firewall-cmd --add-port=9100/tcp --permanent # Node Exporter (only needed if Prometheus scrapes from another host)
sudo firewall-cmd --reload
Important: For Prometheus and Grafana, it's highly recommended to restrict access to a specific range of trusted IPs or proxy them behind a web server with authentication and HTTPS.
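To make the restriction concrete, the sketch below prints (it does not run) the firewall commands that admit a single trusted address to the Prometheus and Grafana ports. print_restricted_rules is a hypothetical helper name and 203.0.113.10 is a placeholder documentation address; substitute your own. If you already added the broad allow rules above, you would also need to remove them for the restriction to take effect.

```shell
# print_restricted_rules TOOL TRUSTED_IP
# TOOL is "ufw" or "firewalld". Prints the commands so you can review them
# before running anything. (Hypothetical helper name, for illustration.)
print_restricted_rules() {
  tool="$1"
  ip="$2"
  for port in 9090 3000; do
    case "$tool" in
      ufw)
        echo "sudo ufw allow from $ip to any port $port proto tcp"
        ;;
      firewalld)
        echo "sudo firewall-cmd --permanent --add-rich-rule='rule family=\"ipv4\" source address=\"$ip\" port port=\"$port\" protocol=\"tcp\" accept'"
        ;;
      *)
        echo "unknown tool: $tool" >&2
        return 1
        ;;
    esac
  done
}

# Example:
# print_restricted_rules ufw 203.0.113.10
```

For firewalld, remember to run sudo firewall-cmd --reload after adding permanent rules.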
2. Secure Grafana:
- Change Default Admin Password: You've already done this (hopefully!).
- Enable HTTPS: Configure Nginx or Apache as a reverse proxy for Grafana and secure it with a Let's Encrypt SSL certificate. This encrypts traffic to your Grafana UI.
- User Management: Create specific users with appropriate roles (Viewer, Editor, Admin) instead of sharing the 'admin' account.
3. Secure Prometheus:
- Reverse Proxy with Authentication: Similarly, use Nginx or Apache to proxy requests to Prometheus's web UI (port 9090) and add basic authentication.
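A minimal sketch of such a proxy for Nginx follows. Everything specific in it is an assumption: write_prometheus_proxy_conf is a hypothetical helper name, prometheus.example.com is a placeholder host name, and /etc/nginx/.htpasswd is an assumed location for a password file created with the htpasswd tool. It listens on plain HTTP for brevity; in production you would add TLS so credentials are not sent in clear text, and restrict direct access to port 9090.

```shell
# write_prometheus_proxy_conf OUTPUT_FILE SERVER_NAME
# Writes an Nginx server block that puts basic authentication in front of
# Prometheus. (Hypothetical helper name; paths and names are placeholders.)
write_prometheus_proxy_conf() {
  out="$1"
  server_name="$2"
  cat > "$out" <<EOF
server {
    listen 80;
    server_name $server_name;

    location / {
        auth_basic           "Prometheus";
        auth_basic_user_file /etc/nginx/.htpasswd;
        proxy_pass           http://127.0.0.1:9090;
    }
}
EOF
}

# Example (run as root, then test and reload Nginx):
# write_prometheus_proxy_conf /etc/nginx/conf.d/prometheus.conf prometheus.example.com
# nginx -t && systemctl reload nginx
```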
Configuration Examples and Advanced Monitoring
Monitoring Additional Applications (Exporters)
Prometheus's power comes from its ecosystem of 'exporters'. These are small applications that expose metrics from various services in a Prometheus-readable format.
- Database Monitoring: Install mysqld_exporter for MySQL/MariaDB or postgres_exporter for PostgreSQL.
- Web Server Monitoring: Use apache_exporter or nginx_exporter to get metrics on HTTP requests, connections, etc.
- Custom Applications: Use Prometheus client libraries (available for Go, Java, Python, Ruby, Node.js) to instrument your own applications and expose custom metrics.
After installing an exporter, you'll need to add it as a new job_name in your /etc/prometheus/prometheus.yml file, similar to how you added Node Exporter. Remember to reload Prometheus (sudo systemctl reload prometheus) after any configuration changes.
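As an illustration, the sketch below appends a new job to the end of the configuration file. It assumes that scrape_configs is the last top-level section in the file and that its entries are indented by two spaces; if your file is laid out differently, edit it by hand instead. append_scrape_job is a hypothetical helper name, and localhost:9104 in the example is the default mysqld_exporter port.

```shell
# append_scrape_job CONFIG JOB_NAME TARGET
# Appends a static scrape job to CONFIG. Assumes scrape_configs is the last
# top-level key and its list items are indented by two spaces.
# (Hypothetical helper name, for illustration only.)
append_scrape_job() {
  config="$1"
  job="$2"
  target="$3"
  # Refuse to add the same job twice.
  if grep -q "job_name: '$job'" "$config"; then
    echo "job '$job' already present" >&2
    return 1
  fi
  cat >> "$config" <<EOF
  - job_name: '$job'
    static_configs:
      - targets: ['$target']
EOF
}

# Example (run as root, then validate and reload):
# append_scrape_job /etc/prometheus/prometheus.yml mysqld_exporter localhost:9104
# promtool check config /etc/prometheus/prometheus.yml
```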
Alerting with Alertmanager
While Grafana can send basic alerts, Prometheus's Alertmanager handles more sophisticated alerting. It deduplicates, groups, and routes alerts to various notification channels (email, Slack, PagerDuty, etc.).
Basic Alertmanager Setup:
- Download Alertmanager: Similar to Prometheus, download the latest binary from the Prometheus download page.
- Configure Alertmanager: Create /etc/alertmanager/alertmanager.yml to define receivers and routing.
- Create Systemd Service: Set up a systemd service for Alertmanager.
- Integrate with Prometheus: Add the Alertmanager's address to the alerting section of your /etc/prometheus/prometheus.yml.
- Define Alerting Rules: Create .yml files (e.g., /etc/prometheus/rules.yml) with PromQL expressions that define when an alert should fire, and include them in rule_files in prometheus.yml.
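To give a feel for the last step, here is a sketch of a rules file with three host-level alerts built on Node Exporter metrics. The thresholds, durations, group name, and alert names are illustrative assumptions rather than recommendations, and write_example_rules is a hypothetical helper name.

```shell
# write_example_rules OUTPUT_FILE
# Writes an example Prometheus alerting rules file.
# (Hypothetical helper name; thresholds and names are illustrative.)
write_example_rules() {
  out="$1"
  cat > "$out" <<'EOF'
groups:
  - name: host_alerts
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Target {{ $labels.instance }} is down"
      - alert: HighCpuUsage
        expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 90
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "CPU usage above 90% on {{ $labels.instance }}"
      - alert: LowDiskSpace
        expr: (node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) * 100 < 10
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Less than 10% disk space left on {{ $labels.instance }}"
EOF
}

# Example (run as root, then validate before reloading Prometheus):
# write_example_rules /etc/prometheus/rules.yml
# promtool check rules /etc/prometheus/rules.yml
```

The for field delays firing until the condition has held for that long, which filters out short spikes.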
This allows you to be instantly notified if your dedicated server experiences high CPU load, critical disk space usage, or if your game server's processes stop responding. This level of proactive monitoring is invaluable for maintaining high availability for services like web hosting, mail servers, and CI/CD pipelines.
Testing and Verification
- Prometheus Targets: In the Prometheus UI (http://YOUR_SERVER_IP:9090), go to 'Status' -> 'Targets'. Ensure all configured targets (prometheus, node_exporter, and any other exporters) show 'UP'.
- Prometheus Graph: Use the 'Graph' tab in Prometheus to run simple queries, e.g., node_cpu_seconds_total or prometheus_build_info, to ensure data is being collected.
- Grafana Dashboards: Confirm that your imported Node Exporter dashboard (ID 1860) is populating with live data. Try adjusting the time range to see historical data.
- Alerting (if configured): If you've set up Alertmanager, create a test alert rule designed to fire immediately (e.g., vector(1)) to ensure notifications are sent correctly.
Troubleshooting Common Issues
Even with careful setup, issues can arise. Here's how to diagnose and fix common problems:
Service Not Starting
If Prometheus, Node Exporter, or Grafana fail to start:
- Check Logs: The most important step. Use journalctl to view service logs.
sudo journalctl -xeu prometheus
sudo journalctl -xeu node_exporter
sudo journalctl -xeu grafana-server
- Permissions: Ensure the service users (prometheus, node_exporter) have read/write access to their respective directories (/etc/prometheus, /var/lib/prometheus, etc.).
- Configuration Syntax: For Prometheus, run promtool check config /etc/prometheus/prometheus.yml to validate your configuration.
Prometheus Not Scraping Targets ('DOWN' Status)
- Firewall: Ensure the target's port (e.g., 9100 for Node Exporter) is open on the server where the exporter is running.
- IP Address/Port: Double-check the targets entry in prometheus.yml. It must match the exporter's actual listen address.
- Exporter Running: Verify that the exporter service itself is running on the target server (e.g., sudo systemctl status node_exporter).
- Network Connectivity: From the Prometheus server, try to curl http://TARGET_IP:PORT/metrics to see if it's reachable.
Grafana Not Connecting to Prometheus Data Source
- Prometheus Running: Ensure the Prometheus server is active and accessible from the Grafana server.
- Firewall: Confirm that Grafana can reach Prometheus on port 9090 (if on separate servers, or if a strict local firewall is in place).
- URL in Data Source: Verify the 'URL' in Grafana's Prometheus data source configuration is correct (e.g., http://localhost:9090 or http://PROMETHEUS_SERVER_IP:9090).
'No Data' in Grafana Dashboards
- Time Range: Check the time range selector in Grafana. Is it set to 'Last 5 minutes' but your data is older?
- Prometheus Data: Can you see the data directly in the Prometheus UI's 'Graph' tab using the same PromQL queries? If not, Prometheus isn't collecting it.
- Data Source Selected: Ensure the correct Prometheus data source is selected in the dashboard's panel configuration.
- Query Errors: Check the 'Query Inspector' in Grafana for any errors in the PromQL queries used by the dashboard panels.
High Resource Usage by Monitoring Stack
- Prometheus Retention: By default, Prometheus stores data for 15 days. To keep less data (or more, if you have the disk space), set --storage.tsdb.retention.time or --storage.tsdb.retention.size in its systemd service file.
- Scrape Intervals: If you're scraping hundreds of targets every second, Prometheus will consume more resources. Increase scrape_interval in prometheus.yml if necessary.
- Number of Exporters/Metrics: Each exporter adds metrics. Only enable exporters and collect metrics that are genuinely useful.
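To judge how retention and scrape volume translate into disk usage, the Prometheus storage documentation gives a rough formula: needed disk space = retention time in seconds × ingested samples per second × bytes per sample, with roughly 1 to 2 bytes per sample. The sketch below applies it; estimate_tsdb_bytes is a hypothetical helper name and the default of 2 bytes per sample is the conservative end of that range.

```shell
# estimate_tsdb_bytes RETENTION_DAYS SAMPLES_PER_SECOND [BYTES_PER_SAMPLE]
# Prints a rough estimate of TSDB disk usage in bytes.
# (Hypothetical helper name; 2 bytes per sample is an assumed default.)
estimate_tsdb_bytes() {
  days="$1"
  rate="$2"
  bytes="${3:-2}"
  echo $(( days * 86400 * rate * bytes ))
}

# Example: 15 days of retention at 1000 samples per second
# estimate_tsdb_bytes 15 1000
```

You can read your actual ingestion rate from Prometheus itself with the query rate(prometheus_tsdb_head_samples_appended_total[5m]).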
Real-World Use Cases for Dedicated Server Monitoring
Implementing Prometheus and Grafana on your Valebyte dedicated server empowers you to optimize and maintain a wide array of mission-critical applications:
- Game Servers: Monitor CPU, RAM, network latency, and disk I/O to ensure smooth gameplay, detect DDoS attacks, and scale resources efficiently for popular titles.
- High-Traffic Web Hosting: Keep an eye on web server performance (Nginx, Apache), database queries, and application response times to guarantee fast loading speeds and high availability for your websites.
- Robust Databases: Track query performance, connection counts, disk usage, and replication status for MySQL, PostgreSQL, or MongoDB, preventing slowdowns and data loss.
- Reliable Mail Servers: Monitor queue lengths, connection attempts, disk space, and resource usage for Postfix, Dovecot, or Exchange to ensure email delivery and prevent spam issues.
- Streaming Platforms: Oversee bandwidth utilization, concurrent user counts, CPU load from transcoding, and storage performance to deliver seamless media experiences.
- CI/CD Pipelines: Track resource consumption of build agents, job execution times, and disk space during continuous integration and deployment processes, optimizing your development workflow.
- Big Data Processing: Monitor resource allocation, job progress, and cluster health for Spark, Hadoop, or other big data frameworks running on your dedicated infrastructure.
- Virtualization Hosts: If you're running VMs on your bare metal server, monitor the host's overall health and resource distribution to ensure optimal performance for all guests.