Understanding the Factors that Impact Nuxeo Platform Performance
Managing the sizing and performance of any Enterprise Content Management (ECM) application is a complex task, as each application has unique requirements and numerous factors to consider. Fortunately, the Nuxeo Platform is designed with performance optimization in mind, and the Nuxeo team has implemented continuous performance testing as part of their quality assurance process.
The Nuxeo Platform’s performance is measured by metrics focused on user experience, such as application response time. This commitment to ongoing, measured improvement ensures the platform delivers rapid response times, even under heavy loads with thousands of concurrent users accessing a repository containing millions of documents.
To effectively track and optimize the performance of the Nuxeo Platform, it’s essential to identify the key factors that impact performance, as well as those that do not. Let’s explore some of the most critical performance considerations:
Access Checks and Security Policies
One of the most significant factors that can impact Nuxeo Platform performance is the “Access Check” process. In a typical ECM system, users can only view or modify documents they are authorized to access. The Nuxeo Platform’s default security policy uses Access Control Lists (ACLs) to manage these permissions.
Depending on the target use case, the number of ACLs can vary significantly – from very few (when ACLs are defined only on top-level containers) to a large number (when they are defined on almost every document). To handle both scenarios effectively, the Nuxeo Platform provides several optimizations for processing ACLs, such as pre-computing ACL inheritance.
The Nuxeo Platform also allows for the definition of custom security policies based on business rules, which can be converted into queries for faster execution on large document repositories.
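To make the idea of pre-computing ACL inheritance concrete, here is a minimal sketch. It is not Nuxeo's implementation: the Doc class, its fields, and both function names are hypothetical and exist only for illustration. The sketch resolves each document's effective read permissions once, so that an access check at query time becomes a simple set lookup instead of a walk up the folder hierarchy.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional, Set


@dataclass
class Doc:
    """Hypothetical document record, not a Nuxeo class."""
    doc_id: str
    parent_id: Optional[str]
    local_acl: FrozenSet[str] = frozenset()  # principals granted READ here
    blocks_inheritance: bool = False          # True stops inheriting from the parent


def precompute_read_acls(docs: Dict[str, Doc]) -> Dict[str, FrozenSet[str]]:
    """Resolve the effective READ ACL of every document once, ahead of queries."""
    effective: Dict[str, FrozenSet[str]] = {}

    def resolve(doc_id: str) -> FrozenSet[str]:
        if doc_id in effective:  # already computed, reuse it
            return effective[doc_id]
        doc = docs[doc_id]
        inherited: FrozenSet[str] = frozenset()
        if doc.parent_id is not None and not doc.blocks_inheritance:
            inherited = resolve(doc.parent_id)
        effective[doc_id] = doc.local_acl | inherited
        return effective[doc_id]

    for doc_id in docs:
        resolve(doc_id)
    return effective


def can_read(effective: Dict[str, FrozenSet[str]], doc_id: str,
             principals: Set[str]) -> bool:
    """Access check against the pre-computed table: no hierarchy walk needed."""
    return not effective[doc_id].isdisjoint(principals)
```

The trade-off in a design like this is that the table has to be refreshed whenever an ACL changes or a document is moved, which is why it pays off mostly when reads far outnumber permission changes.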
Presentation Layer Optimization
The presentation layer is often a bottleneck in ECM web applications, as mistakes in the display logic (e.g., adding costly tests, fetching too much data) can significantly slow down the application, particularly when using JSF (JavaServer Faces) technology.
The good news is that Nuxeo’s default templates are well-tested for performance. However, when modifying Nuxeo’s templates or adding custom ones, web developers must be mindful of potential performance issues and optimize the display logic accordingly.
Document Type Configuration
Defining custom Document Types in the Nuxeo Platform typically has little to no impact on performance. However, if you define documents with a large number of metadata elements (e.g., several hundred) or complex schemas (such as nested complex types on multiple levels), this can affect performance in the following ways:
- Increased memory usage and processing time for document queries and updates
- Potential issues with the prefetch settings, which may need to be adjusted for your specific use case
To configure the prefetch settings, you can use the TypeService configuration extension point in the Nuxeo Platform.
Repository Size and Usage Patterns
As expected, the number of documents in the Nuxeo Platform’s repository can impact performance. However, the Nuxeo Platform has been successfully tested with repositories containing several million documents on a single server, demonstrating its scalability.
It’s important to note that the performance of the Nuxeo Platform is more closely tied to the number of concurrent requests (transactions or requests per second) than to the number of users. This means that 10 highly active users can put more load on the platform than 100 mostly idle users.
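A quick back-of-the-envelope calculation illustrates the point. The per-user request rates below are invented for the example, not measured Nuxeo figures, and the function name is hypothetical.

```python
def requests_per_second(users: int, requests_per_user_per_minute: float) -> float:
    """Aggregate load generated by a group of users, in requests per second."""
    return users * requests_per_user_per_minute / 60.0


# Assumed rates: a busy user fires 120 requests per minute,
# a mostly idle user only 3 requests per minute.
busy_load = requests_per_second(10, 120)   # 10 busy users
idle_load = requests_per_second(100, 3)    # 100 mostly idle users

print(f"10 busy users:  {busy_load:.1f} req/s")
print(f"100 idle users: {idle_load:.1f} req/s")
```

Under these assumptions the 10 busy users generate 20 requests per second against 5 for the 100 mostly idle users, so sizing should start from the expected request rate rather than from the head count.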
Binary File Handling
The actual size of the binary files stored in the Nuxeo Platform’s repository does not directly impact the performance of the repository itself. Since the binary files are stored in a Binary Store on the file system, rather than in the database, the impact is limited to disk I/O and upload/download time.
The only factor related to binary file size that can impact performance is the size of the full-text content, as it affects the size of the full-text index. However, in most cases, large files (e.g., images, videos, archives) do not contain significant full-text content.
Folder Structure and Hierarchy
When using the Nuxeo Platform’s VCS (Visible Content Store) repository, the number of documents within a folder has no impact on performance. You can have folders with thousands of child documents without any performance degradation.
The key consideration when designing your filing plan is the impact of the folder hierarchy on security management and ACL inheritance, rather than performance.
Technical Factors that Influence Nuxeo Platform Performance
Apart from the use-case-specific factors, there are some technical considerations that can impact the performance of the Nuxeo Platform:
Application Server Choice
The Nuxeo Platform is available on both Tomcat and JBoss application servers. Tomcat generally has better raw performance than JBoss. Additionally, the configuration of the Tomcat HTTP and AJP (Apache JServ Protocol) connectors can significantly impact the server’s behavior under load, and the maximum thread count should be capped to prevent the server from being overloaded.
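One common way to pick a starting value for the connector thread limit (the maxThreads attribute in Tomcat) is Little's Law, which says the average number of requests in flight equals the arrival rate multiplied by the average response time. This is a general queueing rule of thumb, not a Nuxeo recommendation. The target rate, response time, and headroom factor below are assumptions, and the function name is hypothetical.

```python
import math


def suggested_max_threads(target_rps: float, avg_response_s: float,
                          headroom: float = 1.5) -> int:
    """Estimate a thread limit from Little's Law.

    in-flight requests = arrival rate (req/s) x average time in system (s)
    The headroom factor leaves room for bursts; 1.5 is an arbitrary assumption.
    """
    in_flight = target_rps * avg_response_s
    return math.ceil(in_flight * headroom)


# Assumed workload: 200 requests per second at 250 ms average response time.
print(suggested_max_threads(200, 0.25))
```

The result is only a first estimate. A load test should confirm that the server, and the database behind it, can actually sustain that many concurrent requests.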
Under heavy load, the JBoss JTA (Java Transaction API) object store can generate a large number of write operations, even for read-only access. A simple workaround is to use a RAMDisk for the server/default/data/tx-object-store folder.
Database Selection and Tuning
The choice of database has a significant impact on Nuxeo Platform performance. The Nuxeo Platform supports both NoSQL (Document-Based Storage, or DBS) and relational (Visible Content Store, or VCS) repository implementations.
The most performant option is the NoSQL interface with MongoDB implementation. If you choose to use a relational database, the Nuxeo team recommends PostgreSQL, as it comes with the most optimizations.
Regardless of the database choice, proper tuning is essential, as the Nuxeo Platform does not provide default database configurations for production environments.
Network Considerations
The network connection between the Nuxeo Platform application and the database can also impact performance, especially on pages that manipulate many documents and generate numerous JDBC (Java Database Connectivity) round trips.
The Nuxeo team recommends using a Gigabit Ethernet connection and ensuring that any routers, firewalls, or Intrusion Detection Systems (IDS) do not introduce unnecessary latency. You can use tools like ping and ethtool to test the network configuration and measure the latency added to each JDBC round trip.
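The cost of network latency adds up because round trips made one after another are paid sequentially. The sketch below shows the arithmetic. The round-trip count and latency values are illustrative assumptions rather than Nuxeo measurements, and the function name is hypothetical.

```python
def network_overhead_ms(round_trips: int, latency_ms: float) -> float:
    """Time spent purely on the network for sequential database round trips."""
    return round_trips * latency_ms


# Assumed page: 200 sequential JDBC round trips.
same_switch = network_overhead_ms(200, 0.5)    # 0.5 ms measured with ping
via_firewall = network_overhead_ms(200, 2.0)   # 2 ms once a firewall is in the path

print(f"Direct link:  {same_switch:.0f} ms of network time per page")
print(f"Via firewall: {via_firewall:.0f} ms of network time per page")
```

In this example an extra 1.5 ms per round trip, which looks negligible in a ping test, turns into 300 ms of added page time.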
Nuxeo Platform Performance Tracking and Optimization
At Nuxeo, the team takes a comprehensive approach to managing the performance of the Nuxeo Platform and its modules. They provide several tools for load testing and benchmarking the platform, including the use of Gatling for performance testing and the Nuxeo Platform Importer addon for populating the document repository.
To ensure ongoing performance monitoring and detection of regressions, Nuxeo has implemented a system of small benchmark tests that are automatically run every night as part of their Continuous Integration (CI) chain. These fast benchmarks help quickly identify and address any performance issues introduced by changes in the user interface, document types, or other components.
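The same idea can be applied in your own CI chain. The following is a minimal sketch of a regression check, not Nuxeo's tooling: the function name, the benchmark names, the timings, and the 10% tolerance are all assumptions chosen for illustration.

```python
from typing import Dict


def find_regressions(baseline_ms: Dict[str, float], current_ms: Dict[str, float],
                     tolerance: float = 0.10) -> Dict[str, float]:
    """Return benchmarks whose timing exceeds the baseline by more than `tolerance`.

    The value for each flagged benchmark is its relative slowdown
    (0.25 means 25% slower than the baseline).
    """
    regressions: Dict[str, float] = {}
    for name, base in baseline_ms.items():
        current = current_ms.get(name)
        if current is None:
            continue  # benchmark was not run this time, nothing to compare
        if current > base * (1 + tolerance):
            regressions[name] = (current - base) / base
    return regressions


# Hypothetical nightly results, in milliseconds.
baseline = {"create_document": 100.0, "search": 200.0}
tonight = {"create_document": 125.0, "search": 205.0}

for name, slowdown in find_regressions(baseline, tonight).items():
    print(f"REGRESSION {name}: {slowdown:.0%} slower than baseline")
```

A tolerance band matters because benchmark timings fluctuate from run to run. Without it, the nightly job would flag noise rather than real regressions.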
Additionally, Nuxeo conducts major benchmarking campaigns every 2-3 months to test the platform’s limits and identify opportunities for further database and Java optimizations.
To correctly size and configure a Nuxeo Platform-based ECM application, Nuxeo recommends the following approach:
- Define your performance requirements and hypotheses: Carefully consider all the factors that can impact the platform’s performance, such as the number of users, document volumes, security policies, and customizations.
- Implement performance testing from the start: Performance benchmarking should not be an afterthought. It’s more efficient and cost-effective to set up performance tests from the beginning, starting with simple benchmark tests and improving them as your customizations evolve.
- Leverage Nuxeo’s standard benchmarks and tools: Nuxeo provides a set of standard benchmarks for both small and large document repositories, which you can use as a starting point for your own tests. Additionally, Nuxeo’s platform tools, such as the Nuxeo Platform Importer, can help you populate your document repository for more realistic testing.
- Continuously monitor and optimize performance: Implement a comprehensive performance monitoring and optimization strategy, using tools like Graphite and Nuxeo’s own performance monitoring capabilities, to identify and address any performance bottlenecks.
By following this approach and leveraging the performance optimization features and tools provided by Nuxeo, you can ensure that your Nuxeo Platform-based ECM application delivers exceptional performance, even at scale.
Demonstrating Nuxeo Platform Performance at Scale
To showcase the Nuxeo Platform’s ability to handle large document volumes, the Nuxeo team has conducted several performance benchmarks over the years, consistently demonstrating the platform’s scalability and optimization capabilities.
Benchmark 1: 10 Million Documents (2010)
In 2010, a benchmark run against Nuxeo 5.3.1 demonstrated the platform’s ability to handle a repository of 10 million documents without performance issues. The benchmark was run with the ACL optimization (NXP-4524) enabled and achieved the following average times:
- Document retrieval: 150-300 ms
- Document insertion: 500-1000 ms
The benchmark showed no signs of performance degradation, even with the large data volume, once the data was loaded from disk.
Benchmark 2: 100 Million Documents (2019)
Building on the success of the 2010 benchmark, the Nuxeo team conducted a more recent test in 2019, successfully injecting 100 million documents into a single Nuxeo document repository at a constant throughput.
To scale beyond a single repository, the Nuxeo team used the platform’s multi-repository feature and Elasticsearch integration to shard 1 billion documents across ten repositories with a single index.
Benchmark 3: Scaling to 6,000 Requests per Second (2020)
In the most recent benchmark, the Nuxeo team focused on validating the platform’s ability to handle a large number of API calls per second, including complex searches. By utilizing Elasticsearch and configuring the search and fetch processes on a single node, the team achieved impressive results:
- 3,000 requests/second on a single node
- 6,000 requests/second on a cluster of three nodes
These benchmarks demonstrate the Nuxeo Platform’s exceptional performance and scalability, even with massive document volumes and high-throughput requirements. The Nuxeo team continues to push the limits of the platform, ensuring it can meet the demands of the most challenging enterprise content management scenarios.
To learn more about the Nuxeo Platform and its performance capabilities, visit the IT Fix blog or explore the Nuxeo documentation at doc.nuxeo.com.
Conclusion
Tracking and optimizing the performance of the Nuxeo Platform is a crucial aspect of ensuring the success of any enterprise content management solution. By understanding the key factors that impact performance, leveraging the platform’s optimization features, and conducting comprehensive benchmarking and monitoring, organizations can unlock the full potential of the Nuxeo Platform and deliver exceptional experiences for their users.
The Nuxeo team’s commitment to continuous performance improvement and the platform’s robust scalability capabilities make it an excellent choice for enterprises looking to manage their content effectively, even at massive scales. By adopting the Nuxeo Platform and following the best practices outlined in this article, organizations can optimize their ECM operations, drive efficiency, and stay ahead of the competition in today’s dynamic digital landscape.