Log collection in production systems needs to be fast, reliable, and resource-efficient. When you’re dealing with containerized environments or distributed services, shipping logs to a central place becomes critical for monitoring and debugging.

Fluent Bit is a log processor and forwarder designed to do this well. It’s lightweight, written in C, and optimized for performance. You can run it as a daemon on your nodes or as a sidecar in Kubernetes. It supports structured and unstructured logs and integrates with many systems like Elasticsearch, Loki, or S3.

Compared to heavier tools like Fluentd or Logstash, Fluent Bit uses less memory and CPU, which makes it ideal for production workloads—especially in environments where resource usage matters.

This blog walks through how Fluent Bit works, how to configure it, and how to use it in real production setups.

What is Fluent Bit?

Fluent Bit belongs to the family of log collectors and forwarders. It is a lightweight, highly efficient log processor, often described as the younger sibling of Fluentd.

In other words, Fluent Bit acts as a log collector, gathering logs from diverse sources such as traditional servers, Linux environments, containers, Kubernetes clusters, and pods.

It enhances the log data with contextual information, assigns labels to the logs, and transforms the log stream into a structured key-value pair format. These enriched logs can then be seamlessly sent to popular log storage solutions such as Elasticsearch, Kafka, and more.

Fluent Bit is hosted by the CNCF and licensed under the Apache License v2.0; it is a sub-project within the ecosystem of Fluentd, a CNCF graduated project. It was initially developed by Eduardo Silva and, as a community-driven initiative under the CNCF, it maintains a fully vendor-neutral approach.

How Fluent Bit Works

Fluent Bit is a log processor and forwarder. It’s engineered to handle logs from edge to cloud. It works like a pipeline—logs enter, get processed, and leave.

Here’s what happens step by step:

1. Input

This is where Fluent Bit picks up your logs. It can read from:

  • files (like /var/log/...)

  • containers (e.g., Docker)

  • systemd

  • OpenTelemetry

  • or anything that supports a socket, HTTP, or even custom plugins.

Example:
Using the tail plugin, it follows a log file like tail -f:

[INPUT]
    Name tail
    Path /var/log/nginx/access.log
    Tag  nginx.access

2. Parser

If your logs are in plain text, Fluent Bit parses them into structured fields—usually JSON or key-value pairs.

[PARSER]
    Name   nginx
    Format regex
    Regex  ^(?<remote>[^ ]*) ...


Without this step, your logs are just blobs.
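For instance, with the nginx parser bundled in the stock parsers.conf (the field names come from that parser; the values here are illustrative), a raw access-log line such as:

192.168.0.1 - - [12/Mar/2024:10:00:00 +0000] "GET / HTTP/1.1" 200 512

becomes a structured record along the lines of:

{"remote": "192.168.0.1", "method": "GET", "path": "/", "code": "200", "size": "512"}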

3. Filter

Filters help you clean, enrich, or drop log lines before they go anywhere else. Common use cases:

  • Add metadata like environment or region

  • Remove unwanted keys

  • Exclude certain logs (e.g., health checks)

  • Convert fields

Some useful filters:

  • record_modifier

  • grep

  • lua

  • modify

[FILTER]
    Name    grep
    Match   nginx.*
    Exclude message healthcheck

4. Storage (Buffer)

Before sending logs out, Fluent Bit buffers them in memory or a file. This helps if your destination goes down temporarily.

Buffering ensures:

  • Delivery reliability

  • Rate control

  • Recovery on crash (if filesystem buffering is enabled)

[SERVICE]
    storage.metrics on
    storage.path    /var/log/fluentbit/buffers

5. Router

The router reads the tag of each log and matches it to an output. You can send one log to multiple places using rules.
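As a minimal sketch (logs.example.com is a hypothetical host), the rules below fan out one stream: Fluent Bit delivers a copy of each record to every output whose Match pattern fits its tag, so nginx records reach both stdout and the HTTP endpoint, while everything else reaches the HTTP endpoint only.

[OUTPUT]
    Name  stdout
    Match nginx.*

[OUTPUT]
    Name  http
    Match *
    Host  logs.example.com
    Port  443
    tls   On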

6. Output

This is where the logs go—typically:

  • Elasticsearch

  • Amazon S3

  • Kafka

  • HTTP endpoints

  • Another Fluent Bit or Fluentd

  • OpenTelemetry Collector

[OUTPUT]
    Name  http
    Match *
    Host  logs.myapi.com
    Port  443
    tls   On
    URI   /v1/ingest

Plugin Overview

Type     Examples
Input    tail, syslog, http
Parser   json, regex, logfmt
Filter   grep, modify, lua
Output   http, s3, elasticsearch

Summary

Fluent Bit works by chaining these stages:

[INPUT] → [PARSER] → [FILTER] → [STORAGE] → [ROUTER] → [OUTPUT]

Each step is optional but common in most production setups. You define what you need in config files. Fluent Bit then handles the rest—efficiently and with low resource overhead.

Configuring Fluent Bit for Production Log Gathering

Here’s a general guide on how to use Fluent Bit for production log gathering:

1. Installation and Configuration:

  • Install Fluent Bit on the desired host or machine. You can find installation instructions specific to your operating system in the official documentation.
  • Configure Fluent Bit by creating a configuration file (e.g., `fluent-bit.conf`) that specifies the input, filters, and output plugins you want to use. This file determines how Fluent Bit collects, processes, and forwards logs.
  • You can refer to the documentation for detailed configuration options and examples.

2. Define Input Sources:

  • Specify the input sources from which Fluent Bit should collect logs. This can include log files, syslog, Docker containers, or other sources, depending on your setup.
  • Configure the appropriate input plugin in the configuration file, providing the necessary parameters such as log paths, listening ports, or environment variables.

3. Apply Filters (Optional):

  • Utilize filters to modify, parse, or enrich logs before forwarding them. Fluent Bit offers a range of filters, such as the `grep` filter for pattern matching, the `parser` filter for parsing structured logs, and the `record_modifier` filter for altering log records.
  • Choose and configure the filters that suit your specific log processing requirements; a sketch of the `parser` filter follows this list.
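For instance, here is a hedged sketch of the `parser` filter, assuming records tagged `app.*` carry their raw text in a `log` field and that a `json` parser is defined in your parsers file (the stock parsers.conf ships one). `Reserve_Data On` keeps the fields that existed before parsing instead of replacing the whole record.

[FILTER]
    Name         parser
    Match        app.*
    Key_Name     log
    Parser       json
    Reserve_Data On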

4. Define Output Destinations:

  • Specify the output destinations where Fluent Bit should forward the collected logs. This can include various options such as Elasticsearch, Kafka, Amazon S3, or other log aggregation systems.
  • Configure the appropriate output plugin in the configuration file, providing the necessary parameters like host addresses, credentials, or specific formatting options.

5. Start Fluentbit:

  • Once you have completed the configuration, run the Fluent Bit service or execute the Fluent Bit binary with the path to the configuration file as a parameter. For example: `fluent-bit -c fluent-bit.conf`.
  • Ensure that Fluent Bit runs as a background service or as a managed process, depending on your operating system and preferences (for example, via systemd, as sketched below).
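On systemd-based distributions, the official packages ship a service unit; as a hedged sketch (current packages name the unit fluent-bit, while older packages used td-agent-bit):

# enable the service at boot and start it immediately
sudo systemctl enable --now fluent-bit

# confirm it is running
sudo systemctl status fluent-bit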

6. Monitor and Troubleshoot:

  • Monitor Fluent Bit’s logs and metrics to ensure smooth operation and identify any potential issues. Fluent Bit provides its own logging mechanism, and you can configure output plugins to send logs to external monitoring systems if desired.
  • Monitor system resources to ensure Fluent Bit operates within acceptable limits; the built-in HTTP server sketched below is one way to expose its internal metrics.
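One hedged option is Fluent Bit’s built-in HTTP server, enabled in the `[SERVICE]` section, which exposes internal metrics you can scrape or query ad hoc (2020 is the conventional port):

[SERVICE]
    HTTP_Server On
    HTTP_Listen 0.0.0.0
    HTTP_Port   2020

# then, for example:
curl -s http://127.0.0.1:2020/api/v1/metrics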

7. Scale and Optimize:

  • As your log volume increases or your infrastructure grows, consider scaling Fluent Bit horizontally to handle the load. You can distribute Fluent Bit instances across multiple machines or containers and configure them to aggregate and forward logs to centralized destinations (a forwarder-aggregator pattern is sketched after this list).
  • Continuously optimize the configuration and plugins to improve performance and meet evolving requirements.
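A common scaling pattern, sketched here with a hypothetical hostname (aggregator.internal), is to run a lightweight Fluent Bit agent on each node and forward everything to a central aggregator, which another Fluent Bit or Fluentd instance accepts on the forward protocol’s default port:

# on each node: ship all records to the central aggregator
[OUTPUT]
    Name  forward
    Match *
    Host  aggregator.internal
    Port  24224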

By following these steps, you can effectively utilize Fluent Bit for log gathering in a production environment. Remember to consult the documentation and use its extensive features and plugins to tailor the setup to your specific use case.

Configuration Steps

To configure Fluent Bit for production log gathering using the HTTP output plugin and enabling storage persistence, follow these steps:

1. Open the Fluent Bit configuration file (typically named `fluent-bit.conf`) for editing.

2. Add the following configuration settings under the `[SERVICE]` section:

[SERVICE]
    Flush                     1
    Parsers_File              /etc/td-agent-bit/parsers.conf
    Log_Level                 error
    storage.path              /var/log/flb_storage_
    storage.sync              normal
    storage.checksum          On
    storage.backlog.mem_limit 700kb
    storage.metrics           On

The above settings configure the flushing frequency, specify the parsers file path, set the log level to `error`, point filesystem buffering at a storage path, set buffer synchronization to `normal`, enable checksum verification, set the memory limit for replaying the backlog, and enable storage metrics. Note that `storage.type` is a per-input setting; it appears in the `[INPUT]` section below, which is what actually switches that input to filesystem buffering.

The `Log_Level` setting in the `[SERVICE]` section of the Fluent Bit configuration file determines the verbosity of log messages.

While the example above sets it to `error`, it can also be set to `debug` for more detailed logging. With `Log_Level` set to `debug`, Fluent Bit emits additional messages that can aid in debugging and troubleshooting potential issues or errors.

However, debug logging can generate a much higher volume of log output, which may affect system performance, so use it judiciously in production environments.

3. Add the input configuration section:

[INPUT]
    Name             tail
    Path             /var/log/*.log
    Path_Key         filename
    Tag              Apica
    Buffer_Max_Size  1024k
    Read_from_Head   On
    Mem_Buf_Limit    1MB
    Refresh_Interval 5
    storage.type     filesystem

This configuration uses the `tail` input plugin to monitor every file matching the `*.log` pattern in the `/var/log/` directory, records each file’s name under the `filename` key, and tags the records as `Apica`. It also sets the maximum buffer size, reads each file from the beginning, caps the memory buffer at 1 MB, rescans the path every 5 seconds, and enables filesystem buffering for this input.

4. Add the filter configuration sections:

[FILTER]
    Name              record_modifier
    Match             Apica
    Record cluster_id flash

[FILTER]
    Name             record_modifier
    Match            Apica
    Record namespace xyz

[FILTER]
    Name            record_modifier
    Match           Apica
    Record app_name system_logs

[FILTER]
    Name     throttle
    Match    *
    Rate     700
    Window   300
    Interval 1s

These filters act on records carrying the `Apica` tag. The `record_modifier` plugin enriches each record with `cluster_id`, `namespace`, and `app_name` fields, which adds context and makes logs easier to organize downstream.

The `throttle` plugin controls the log ingestion rate. With `Rate 700`, `Interval 1s`, and `Window 300`, it caps throughput at an average of 700 records per second, measured over a sliding window of 300 one-second intervals; records beyond that limit are dropped rather than queued. This smooths out sudden spikes, protects downstream systems from being overwhelmed, optimizes resource utilization, and keeps the flow of log data consistent and predictable.

5. Add the output configuration section:

[OUTPUT]
    Name          http
    Match         *
    Host          lq5955.apica.io
    Port          80
    URI           /v1/json_batch
    Format        json
    tls           off
    tls.verify    off
    net.keepalive off
    compress      gzip
    Header        Authorization Bearer ${Apica_TOKEN}

This configuration specifies the `http` output plugin, matches all logs, sends them to the host `lq5955.apica.io` on port `80` at the URI `/v1/json_batch`, formats records as JSON, disables TLS and TLS verification, disables TCP keepalive, enables gzip compression, and attaches an Authorization header whose token comes from the `${Apica_TOKEN}` environment variable.

6. Save and close the configuration file.

Ensure that you have the necessary permissions and access to the log files specified in the `Path` configuration.

Verify that the HTTP endpoint, including the host, port, and URI, is correct and that the required credentials or tokens are accurately provided. Once you’ve completed these steps, start or restart Fluent Bit with the updated configuration file, and it will start sending log events to the specified HTTP endpoint, using the configured plugins and settings for storage persistence.
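As a hedged sketch of that last step (the configuration path is an assumption, and the --dry-run flag exists only in recent Fluent Bit releases):

# export the token referenced by ${Apica_TOKEN} in the output section
export Apica_TOKEN="<your-token>"

# validate the configuration without starting the pipeline (recent releases)
fluent-bit -c fluent-bit.conf --dry-run

# run with the updated configuration
fluent-bit -c fluent-bit.conf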

Fluent Bit Limitations

Fluent Bit, like any other log processing tool, has certain limitations in its default configuration that can result in data loss and in imbalances between data input and output rates.

However, these limitations can be mitigated by using storage persistence mechanisms to ensure that data is not lost.

The following are some limitations that stick out:

Data Loss:

  • By default, Fluent Bit prioritizes performance over durability, which can result in potential data loss during high-volume scenarios.
  • To mitigate data loss, enable storage persistence to store logs in a reliable and persistent manner, minimizing the risk of losing log data.

In/Out Imbalances:

  • In certain situations, the input rate of logs may exceed the output rate, causing a backlog of logs and potential data loss.
  • Configure storage persistence to provide a buffer for logs, allowing Fluent Bit to handle fluctuations in log input and output rates more effectively (see the input-level sketch below).
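One hedged mitigation at the input level (storage.pause_on_chunks_overlimit requires a recent Fluent Bit release): buffer the input on the filesystem and pause ingestion, rather than discard records, when too many chunks are queued:

[INPUT]
    Name                              tail
    Path                              /var/log/*.log
    Mem_Buf_Limit                     1MB
    storage.type                      filesystem
    storage.pause_on_chunks_overlimit on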

Storage Persistence:

  • Enabling storage persistence ensures that log data is safely stored on disk, even in the event of system failures or unexpected shutdowns.
  • By persisting logs to disk, Fluent Bit can recover and resume processing logs from where they left off, reducing the chances of data loss.

Configuring Storage:

  • Define the storage type (e.g., filesystem) and specify the storage path where logs will be stored.
  • Adjust the storage settings, such as the memory limit, to accommodate the volume of logs and the available resources; a sizing sketch follows this list.
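As a hedged sizing sketch at the service level (values illustrative): storage.max_chunks_up caps how many buffered chunks stay in memory at once, and storage.backlog.mem_limit caps the memory used to replay chunks left over from a previous run:

[SERVICE]
    storage.path              /var/log/flb_storage
    storage.sync              normal
    storage.max_chunks_up     128
    storage.backlog.mem_limit 5M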

Monitoring and Capacity Planning:

  • Regularly monitor Fluent Bit’s performance and resource usage, especially when handling large amounts of log data.
  • Plan and allocate sufficient storage space to avoid running out of disk space, which could lead to data loss.

By understanding and addressing these limitations through the use of storage persistence and careful configuration, you can enhance the reliability and durability of Fluent Bit, ensuring that log data is not lost and maintaining data integrity in log processing workflows.

What’s Ahead?

We hope this post has been useful in helping you understand the potential limitations of using Fluent Bit to process log data, along with how to use it for production log gathering.

For more information on Fluent Bit and its capabilities, check out our documentation and contact us if you have any questions. We’d love to hear from you!

In the next few posts, we’ll explore how you can use Fluent Bit to ship Windows logs to Apica, how to forward Amazon-Linux logs to Apica using Fluent Bit, and finally, how to install Fluent Bit on Ubuntu.

Until then, happy logging!

Quick Roundup

  • What is Fluent Bit?
    A lightweight log processor and forwarder built in C—ideal for high-performance, low-resource environments like Kubernetes.

  • Why use Fluent Bit in production?
    Uses less CPU/memory than Logstash or Fluentd, supports structured/unstructured logs, and integrates with tools like S3, Elasticsearch, and Loki.

  • How does it work?
    Follows a clear pipeline: Input → Parser → Filter → Buffer → Router → Output, with plugins for each stage.

  • How to configure it?
    Use a .conf file to define log sources, optional filters, persistent storage, and output destinations. Can run as a daemon or sidecar.

  • Any limitations?
    Defaults favor speed over durability—so enable file-based buffering and throttling to prevent data loss or output bottlenecks.