Hi, during load testing of our AEM-based application we ran into the following issue:
Under heavy load (approx. 1500 users) the system becomes unresponsive because request threads block on the com.day.crx.core.data.ClusterDataStore monitor. Here is an excerpt of our thread dump to illustrate the issue:
"192.168.50.155 [1404323964770] GET /content/app/page.html HTTP/1.0" Id=621 BLOCKED on com.day.crx.core.data.ClusterDataStore@4ed5da53 owned by "192.168.50.155 [1404323964427] GET /content/app/page.html HTTP/1.0" Id=488
    at com.day.crx.core.data.ClusterDataStore.getRecordIfStored(ClusterDataStore.java:236)
    - blocked on com.day.crx.core.data.ClusterDataStore@4ed5da53
    at org.apache.jackrabbit.core.data.AbstractDataStore.getRecord(AbstractDataStore.java:42)
    at org.apache.jackrabbit.core.value.BLOBInDataStore.getDataRecord(BLOBInDataStore.java:151)
    at org.apache.jackrabbit.core.value.BLOBInDataStore.getSize(BLOBInDataStore.java:96)
    at org.apache.jackrabbit.core.value.InternalValue.getLength(InternalValue.java:654)
    at org.apache.jackrabbit.core.PropertyImpl.getLength(PropertyImpl.java:237)
    at org.apache.jackrabbit.core.PropertyImpl.getLength(PropertyImpl.java:835)
    at org.apache.sling.jcr.resource.internal.helper.jcr.JcrNodeResource.setMetaData(JcrNodeResource.java:307)
    at org.apache.sling.jcr.resource.internal.helper.jcr.JcrNodeResource.<init>(JcrNodeResource.java:97)
    at org.apache.sling.jcr.resource.internal.helper.jcr.JcrResourceProvider.createResource(JcrResourceProvider.java:191)
    at org.apache.sling.jcr.resource.internal.helper.jcr.JcrResourceProvider.getResource(JcrResourceProvider.java:130)
    at org.apache.sling.resourceresolver.impl.tree.ResourceProviderFactoryHandler.getResource(ResourceProviderFactoryHandler.java:107)
    at org.apache.sling.resourceresolver.impl.tree.ResourceProviderEntry.getResourceFromProviders(ResourceProviderEntry.java:382)
    at org.apache.sling.resourceresolver.impl.tree.ResourceProviderEntry.getInternalResource(ResourceProviderEntry.java:345)
    at org.apache.sling.resourceresolver.impl.tree.ResourceProviderEntry.getResource(ResourceProviderEntry.java:131)
    ...
(for some reason I could not attach the entire stack trace)
So, we'd like to ask two questions regarding this problem:
1. Is this a known issue? Where can we find recommendations on how to increase the throughput of an AEM application?
2. What is the maximum supported load for a single AEM instance? Are these metrics available somewhere in the documentation?
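
In case it helps anyone triaging a similar dump: to see how many request threads are stuck behind the same monitor, the full dump can be scanned for the "BLOCKED on <lock>" part of each thread header. Below is a minimal sketch, assuming the dump uses the same header format as the excerpt above. The class name BlockedThreadCounter and the method countBlockedByLock are our own hypothetical helpers, not part of AEM, CRX or Jackrabbit.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: counts threads per lock in a textual thread dump.
final class BlockedThreadCounter {

    // Matches the thread header form "... Id=621 BLOCKED on <lock> owned by ...".
    // The match is case-sensitive, so the "- blocked on <lock>" frame
    // annotation lines are not counted a second time.
    private static final Pattern BLOCKED = Pattern.compile("BLOCKED on (\\S+)");

    // Returns a map from lock identity (class@hash) to the number of
    // threads reported as BLOCKED on it, sorted by lock name.
    static Map<String, Integer> countBlockedByLock(String dump) {
        Map<String, Integer> counts = new TreeMap<>();
        Matcher m = BLOCKED.matcher(dump);
        while (m.find()) {
            counts.merge(m.group(1), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: BlockedThreadCounter <thread-dump-file>");
            return;
        }
        String dump = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        countBlockedByLock(dump).forEach(
                (lock, n) -> System.out.println(n + " thread(s) blocked on " + lock));
    }
}
```

Run against a full dump, this should show whether nearly all blocked threads wait on the single ClusterDataStore instance, as the excerpt suggests, or whether other locks are involved as well.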