Saturday, December 24, 2011

Holiday Readiness | Performance tips for WCS sites?

1. To start with, the DB cleanup jobs for guest users, guest orders, CTXMGMT, and CTXDATA need to be set up and monitored regularly.
2. Expired promotions should be monitored and deactivated. This was a problem in previous versions.
3. The number of times OrderCalculateCmd is called should be kept to a minimum. This is a very heavy command, as it calls the calculation engine for tax/shipping/promotions. On one of the sites I worked on, it was called multiple times during merge cart, cart, and checkout. If the number of items in the cart remains the same and promotions are not added at run time, try to avoid calling it.
4. Junk cart should be assigned to a custom user. This is to safeguard against any valid scenario invoking JunkCart on a live site.
5. DynaCache greatly enhances performance and should be used, as it significantly reduces CPU usage. One can also create CacheableControllerCommands to take DynaCache to the next level; I have used this, and it improves performance a lot.
6. Monitor production errors and the calls recorded in the web server access logs, and fix any errors found. Reducing errors reduces load on the application server, which helps performance.
7. Pricing is one of the heavy calls in WCS. We built a pricing registry to address the pricing performance problem, and it helped greatly.
8. On a couple of sites I worked on, Ajax calls were used heavily. Most of our Ajax calls are DynaCached, which helps a lot. Also, sections of a page such as the cart page that bring up accessories or other merchandising associations should be converted to Ajax calls so that they do not block the main functionality of the page.
9. DB performance: make sure the stats are gathered correctly, find the top 10 most performance-intensive and most frequently run queries, and make sure the indexes for these queries are correctly defined. Tune the queries to make sure they are efficient (for Oracle, use EXPLAIN PLAN).
10. Make sure all custom tables have opt counter triggers; otherwise this could cause race conditions.
11. A lot of performance is gained from edge caching. We use Akamai, where all the product and category pages are completely cached. The edge servers are refreshed every night after stageprop. (This is not required if new content is not published every night.)
12. Limit the number of items in the cart. Some bots try to add hundreds of products to the cart; that by itself does not break anything, but combined with automatic promotions it could cause instability. I would advise a configurable parameter for the cart limit and custom code to enforce it. If you do not want to do a code fix, you can also add a database constraint:


ALTER TABLE SCHEMA_NAME.ORDERITEMS ADD
  CONSTRAINT check_orderitems_1
  CHECK (quantity <= 100) ENABLE VALIDATE;

The constraint caps the quantity of each order item at 100. When it is violated, the add to cart fails and the user is taken back to the home page.
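For the code-based option, a configurable cart limit could be sketched roughly as follows. The class `CartLimitPolicy`, its method, and the limit values are hypothetical and not part of WCS; in a real store the check would be called from a customized add-to-cart command, with the limits read from store configuration.

```java
/** Hypothetical guard: caps the number of cart lines and the quantity per line. */
final class CartLimitPolicy {
    private final int maxLines;
    private final int maxQuantityPerLine;

    CartLimitPolicy(int maxLines, int maxQuantityPerLine) {
        if (maxLines <= 0 || maxQuantityPerLine <= 0) {
            throw new IllegalArgumentException("limits must be positive");
        }
        this.maxLines = maxLines;
        this.maxQuantityPerLine = maxQuantityPerLine;
    }

    /**
     * Returns true if a new line with the given quantity may be added
     * to a cart that currently holds currentLineCount lines.
     */
    boolean allowsAdd(int currentLineCount, int quantityToAdd) {
        if (quantityToAdd <= 0 || quantityToAdd > maxQuantityPerLine) {
            return false; // quantity outside the allowed range
        }
        return currentLineCount < maxLines; // room for one more line
    }
}
```

Rejecting the request in code lets the store show a friendly message instead of relying on a database error.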

13. Extra objects in memory can be controlled, especially when using beans in JSPs, by making sure the scope is correctly defined. This especially starts to matter for the order tunnel pages, as there are multiple imports of different JSPs. Setting the scope to request helps.

14. Reduce guest user creation in the system by making sure isGeneric returns true for any command that needs to be usable by generic users. We found multiple commands where turning isGeneric to true prevented the creation of thousands of guest users. By default, isGeneric is false.
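As a rough illustration of the override, here is a minimal sketch. `StandInControllerCommand` is only a stand-in for the WCS controller command base class so that the example compiles on its own, and `StoreLocatorLookupCmdImpl` is a hypothetical read-only command; neither name comes from WCS.

```java
/** Stand-in for the WCS controller command base class (assumption, not the real API). */
abstract class StandInControllerCommand {
    /** Mirrors the default described above: commands are not generic. */
    public boolean isGeneric() {
        return false;
    }

    public abstract void performExecute();
}

/** Hypothetical read-only command that does not need a per-shopper identity. */
class StoreLocatorLookupCmdImpl extends StandInControllerCommand {
    @Override
    public boolean isGeneric() {
        // Allow the command to run for the generic user,
        // so calling it does not force a new guest user to be created.
        return true;
    }

    @Override
    public void performExecute() {
        // Read-only lookup logic would go here.
    }
}
```

Only commands that never write shopper-specific data are good candidates for this change.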

15. If you are not using OOB payments, some commands could still be called and populate the payment tables. On top of disabling payments in wc-server.xml, the following update is required.

INSERT INTO CMDREG(STOREENT_ID, INTERFACENAME, CLASSNAME, TARGET) VALUES (0,'com.ibm.commerce.edp.commands.PrimePaymentCmd','com.ibm.commerce.edp.commands.PrimePaymentVoidCmdImpl','Local');
INSERT INTO CMDREG(STOREENT_ID, INTERFACENAME, CLASSNAME, TARGET) VALUES (0,'com.ibm.commerce.edp.commands.ReservePaymentCmd','com.ibm.commerce.edp.commands.ReservePaymentVoidCmdImpl','Local');
INSERT INTO CMDREG(STOREENT_ID, INTERFACENAME, CLASSNAME, TARGET) VALUES (0,'com.ibm.commerce.edp.commands.FinalizePaymentCmd','com.ibm.commerce.edp.commands.FinalizePaymentVoidCmdImpl','Local');
INSERT INTO CMDREG(STOREENT_ID, INTERFACENAME, CLASSNAME, TARGET) VALUES (0,'com.ibm.commerce.edp.commands.TriggerPaymentActionsCmd','com.ibm.commerce.edp.commands.TriggerPaymentActionsVoidCmdImpl','Local');
INSERT INTO CMDREG(STOREENT_ID, INTERFACENAME, CLASSNAME, TARGET) VALUES (0,'com.ibm.commerce.edp.commands.CancelOrderCmd','com.ibm.commerce.edp.commands.CancelOrderVoidCmdImpl','Local');
INSERT INTO CMDREG(STOREENT_ID, INTERFACENAME, CLASSNAME, TARGET) VALUES (0,'com.ibm.commerce.edp.commands.StoreAndValidatePaymentCmd','com.ibm.commerce.edp.commands.StoreAndValidatePaymentVoidCmdImpl','Local');
INSERT INTO CMDREG(STOREENT_ID, INTERFACENAME, CLASSNAME, TARGET) VALUES (0,'com.ibm.commerce.edp.commands.PIAddCmd','com.ibm.commerce.edp.commands.PIAddVoidCmdImpl','Local');

16. Multiple invalid cookie errors were caused by contention, as we have multiple Ajax calls per page. We implemented a CustomFilter that converts these requests into stateless requests so that Commerce can process them as generic user requests.
// Check for stateless URL
if (uri != null && urlsToFilter != null && urlsToFilter.indexOf(uri) >= 0) {
    request = new CustomHttpServletRequestWrapper(request, response);
    response = new CustomHttpServletResponseWrapper(response);
}

chain.doFilter(request, response);
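If `urlsToFilter` is a single delimited String, the `indexOf` check above matches substrings, so a short URI could match part of a longer configured one. A possible alternative, sketched here under that assumption, is to parse the configured list once and match exactly. The class `StatelessUrlMatcher` and the example URIs are hypothetical.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: exact matching of request URIs against a configured list. */
final class StatelessUrlMatcher {
    private final Set<String> statelessUris;

    /** Parses a comma-separated list such as "/shop/AjaxPrice, /shop/AjaxInventory". */
    StatelessUrlMatcher(String commaSeparatedUris) {
        Set<String> uris = new HashSet<>();
        if (commaSeparatedUris != null) {
            for (String part : commaSeparatedUris.split(",")) {
                String trimmed = part.trim();
                if (!trimmed.isEmpty()) {
                    uris.add(trimmed);
                }
            }
        }
        this.statelessUris = Collections.unmodifiableSet(uris);
    }

    /** Returns true only when the URI is exactly one of the configured entries. */
    boolean isStateless(String uri) {
        return uri != null && statelessUris.contains(uri);
    }
}
```

The filter would then call `isStateless(uri)` in place of the `indexOf` test before wrapping the request and response.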

LoadRunner | Holiday Readiness

Most large B2C enterprises perform a holiday readiness exercise to make sure the site remains stable and can handle the holiday web traffic smoothly. As part of the exercise, multiple load test runs are performed and monitored, usually in a performance environment with production data, and any bottlenecks and performance issues found are fixed.

Important Analysis pointers:
Java garbage collection: If response time increases in bursts, it can often be attributed to garbage collection.
The graphs for hits per second should follow a similar pattern to Throughput otherwise there is something in
Correlating failed transactions with errors.

Monitor CPU/IO/memory during load tests: It is a good idea to sample these every 30 seconds for the duration of the load tests, on both the app server and the DB server.

vmstat - reports information about processes, memory, paging, block IO, traps, and cpu activity
iostat - Used for reporting CPU statistics and input/output statistics for devices, partitions and network filesystems (NFS)

LoadRunner has 3 main components:
VuGen (Virtual User Generator)
--Load generators are installed on multiple machines to spread the load for generating the traffic.
--The number depends on the traffic required, but it is usually 3-5.
LoadRunner Controller (Console)
--This is the main terminal used to connect to consoles, load scripts, and run LoadRunner tests.
Analysis
--This is used to view charts and graphs of the data points from the test runs.

Some key terminology when using LoadRunner:
Virtual User & Distribution of virtual users: Virtual users are used to emulate the behavior of real users.
Transaction Mix: This is based on business scenarios. Usually the load is divided amongst various scripts based on traffic patterns from web analytics (Omniture, Google, Coremetrics, or similar).
Ramp up/Ramp down: The load on the performance site is gradually increased and gradually decreased. Increasing gradually lets the caches warm up, which gives more accurate test results.
Virtual users are usually added in steps, with a 2-3 interval between steps.
Think Time: This is the time between 2 actions. In trying to emulate human behavior, it is the time a human takes to think between page clicks.
Total duration of test = Ramp up + Duration of load test + Ramp down
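
The duration formula above can be worked through with concrete numbers. This is only a sketch: the class `LoadTestSchedule`, the batch size, and all the timings are assumptions made up for illustration, not LoadRunner settings or recommendations.

```java
import java.time.Duration;

/** Hypothetical helper for planning the length of a load test run. */
final class LoadTestSchedule {

    /** Total duration of test = ramp up + duration of load test + ramp down. */
    static Duration totalDuration(Duration rampUp, Duration loadTest, Duration rampDown) {
        return rampUp.plus(loadTest).plus(rampDown);
    }

    /** Ramp-up time when virtual users are added in equal batches at a fixed interval. */
    static Duration rampUpTime(int totalVusers, int vusersPerStep, Duration stepInterval) {
        if (totalVusers <= 0 || vusersPerStep <= 0) {
            throw new IllegalArgumentException("user counts must be positive");
        }
        int steps = (totalVusers + vusersPerStep - 1) / vusersPerStep; // ceiling division
        // The first batch starts at time zero, so only (steps - 1) intervals elapse.
        return stepInterval.multipliedBy(steps - 1);
    }
}
```

With 500 virtual users added 50 at a time every minute, ramp-up takes 9 minutes; with a 60-minute load test and a 5-minute ramp-down, the total comes to 74 minutes.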

Key Analysis parameters:
Response Time per Transaction: Time taken for the application to complete a transaction or a business scenario. This time includes network transmission + network latency + web server processing time + app server processing time + DB server processing time.
This is a very helpful parameter for determining how the system would really behave during peak loads.
Hits/Second: Number of requests hitting the web server per second. This is helpful in conjunction with transaction response time to see how the performance of the site behaves as the number of hits/sec increases.

Throughput: The amount of data, in kilobytes, received from the server by the virtual users per second.

HTTP Response Counts: Most valid responses are 200 or 303 (redirection). It is a good idea to keep tabs on 403 (forbidden), 404 (page not found), and 500 (internal server error).

Error Count: Keeps track of the number of errors. This is a good measure for making sure there are no application server errors, and it is really helpful for detecting when the correct build and database changes are not present in the performance environment.
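
The HTTP response counts can also be cross-checked against the web server access logs. A minimal sketch, assuming the logs use the Common Log Format, where the status code follows the quoted request line; the class name `StatusCodeTally` is made up for illustration.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: counts HTTP status codes in access log lines. */
final class StatusCodeTally {
    // In Common Log Format the status code comes right after the closing
    // quote of the request line, e.g. "GET /cart HTTP/1.1" 200 5120
    private static final Pattern STATUS = Pattern.compile("\"\\s(\\d{3})\\s");

    /** Returns status code -> number of occurrences, sorted by status code. */
    static Map<Integer, Integer> count(Iterable<String> accessLogLines) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (String line : accessLogLines) {
            Matcher m = STATUS.matcher(line);
            if (m.find()) {
                counts.merge(Integer.parseInt(m.group(1)), 1, Integer::sum);
            }
            // Lines without a recognizable status code are skipped.
        }
        return counts;
    }
}
```

A rising share of 403, 404, or 500 entries during a run is a cue to correlate the failed transactions with the application server logs.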