You cannot improve what you cannot measure
You cannot improve what you cannot measure
-anonymous
Isn’t it so true? What’s the point of investing in scaling up, scaling out, adding caching, accelerators, etc., when you don’t know what you are getting in return? Before doing any performance tuning, it is important to measure the current performance so that you can benchmark the improvements against it. It will also boost your ego.
JMeter from Apache is probably one of the best open source tools for loading servers and measuring response times. There are other performance testing tools, like YSlow, which I will not cover here since it cannot do load testing; that will be a separate post. But do yourself a favor and at least install YSlow in Firefox. It’s a drop-dead simple tool that gives you the most crucial information. Of course you could also pay me top $$ for the same information.
1. JMeter: If you are anything like most geeks, you have already clicked on the link above and are well on your way to discovering the tool yourself. If you are still with me, which I hope you are, the video below will show you how to set up a simple test. Here is the written summary, though:
1. First make sure you have Java installed (not covered in this blog).
2. Download the JMeter zip and unzip it into the directory you would like to install it in (download JMeter).
3. Run jmeter.bat (Windows) or jmeter (*nix).
4. Once you see the JMeter user interface, set up a proxy on your local machine to route all internet traffic through it (figure 1)
Figure 1 - Proxy setup
5. Create a new thread group, which will have your tests in it (figure 2)
Figure 2 - Threadgroup
5.1 You can define the number of users(threads), iterations(loops), and time between each new thread. This will represent the load.
5.2 Add the HTTP Cookie manager and HTTP cache manager (figure 3). Without Cookie manager JMeter cannot handle the cookies from the server and the tests will not run.
Figure 3 - Add managers
6. Add a new HTTP Request from the samplers (figure 4). Enter the name of the server in Server Name or IP. This is the web server against which the test will run.
Figure 4 - Add http request
7. Also add listeners which will record the results of the test (figure 5)
Figure 5 - Add listeners
8. Run your test from Run->Start
9. Enjoy the results.
Tip #8: How to save test results
You can choose the format of the results file (XML or CSV) and the fields to be saved by clicking the Configure button and checking the necessary options. I prefer the XML format with the default options, plus Save URL, which is useful when I need to know the exact URLs of HTTP requests.
Sessions at DrupalCon
I am very excited about DrupalCon 2010 at SFO. We have proposed the following sessions:
Spreadsheet integration with Drupal – In depth
Business Essentials: Print, Excel and Calendar integration with Drupal
Hands on session: DDBlock, Carousel, Popups
Please vote for the sessions so we can present at DrupalCon. You have to first create a login at drupal.org to be able to vote.
Tip #7: How to add cookie support to your Test Plan
Be sure that the Cookie Policy option is set to "compatibility"; it will work in most cases. As for the "Clear cookies each iteration?" checkbox, I always check it and have never had a situation where I needed it unchecked.
You don’t have to be Superman to get better Drupal performance
Caching is one of the most common ways of improving the performance of a website. Caching aims to reduce the number of trips made to the database by storing a snapshot of the results in a location (such as the database, the file system or memory) from which it can be retrieved faster the next time. Caching works best for information that does not change often, is consumed frequently, and/or is expensive to compute. Periodic maintenance needs to be done on the cached information so that website users get only the latest information and not ‘stale’ information. During development, one of the most common frustrations is not seeing the latest changes that have been made, because the page is being served from a cache that still holds the old information.
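To make the idea concrete, here is a minimal sketch of a cache with an expiry time, written in Python rather than PHP. It is not how Drupal implements its caches; the class name TTLCache and its methods are made up for illustration. It only shows the pattern described above: serve the stored snapshot while it is fresh, rebuild it once it has gone stale, and allow everything to be cleared by hand.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds (illustrative only)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock   # injectable so the expiry logic can be tested
        self._store = {}     # key -> (expires_at, value)

    def get(self, key, compute):
        """Return the cached value; call compute() only on a miss or after expiry."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no expensive work needed
        value = compute()    # the expensive part, e.g. a database query
        self._store[key] = (now + self.ttl_seconds, value)
        return value

    def clear(self):
        """Drop everything, e.g. during development when stale pages get in the way."""
        self._store.clear()
```

A short lifetime keeps content fresh at the cost of more rebuilds; a long lifetime does the opposite. That is the same trade-off the cache lifetime settings expose.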
Drupal’s File-based Cache:
Drupal provides a way to consolidate all the css and javascript files into fewer files. This is very useful for pages where many javascript and css files are used to render. Instead of having many roundtrips downloading each of those files, a consolidated file would reduce the roundtrips and can decrease the page loading time significantly.
The setting for optimizing the css and javascript files can be found at Administer->Site Configuration->Performance. If “Optimize CSS files” is enabled the css files are compressed by removing the whitespace and line breaks and stored in the “css” directory within the “files” directory as set on the file system settings page. If “Optimize JavaScript files” is enabled the javascript files (without compression) are stored in the “js” directory within the “files” directory. The “Download method” in the file system settings page has to be set as “Public” for these options to be available.
It is better to turn on these options only in the production environment, as they can interfere with theme and module development. Also, if you are running a load balancer in front of two or more servers, please make sure that the cached JavaScript and CSS files are available and identical on all the servers.
Drupal’s Database-based Cache:
Drupal comes with a nice cache mechanism that is used by the Drupal core. Drupal exposes this as an api called Cache api that can be used by developers to add caching to their own modules.
The Cache API by default stores the information to be cached in a table called ‘cache’. If every module used the same table for caching, the table could grow very large, increasing the overhead rather than reducing it. Hence it is better to spread cached information over several cache tables. If needed, module developers can add their own cache table, although it should be identical in structure to the default cache table. It is also good practice to prefix the name of your own cache table with cache_.
Drupal core comes with seven cache tables.
1. cache:
- An “all purpose” table that can be used to store a few rows of cached data. Drupal uses this table to store the following:
  - Variable data: The “variable” table stores most of the administrative settings. These settings are added to a PHP array(), serialized and stored as a single row in the cache table. This way the settings can be retrieved quickly without making multiple database queries. All variables that use variable_get() and variable_set() are cached in the cache table.
  - Theme registry data: Each theme’s registry is cached in the cache table.
  - Schema data: Information about the structure of all the tables in the database is cached in the cache table.
2. cache_block:
- Content generated by blocks is cached in this table. This saves Drupal from querying the database repeatedly to get the block contents. The block developer can choose, based on the content displayed in the block, whether the block can be cached or not. If the developer decides to cache the block content, he/she has four ways of doing so:
  - Cache block content separately for each available role combination
  - Cache block content separately for each user
  - Cache block content separately for each page
  - Cache block content once for all the users
- In addition to the above, Drupal caches block content separately for each theme and for each supported language.
- Block caching can be changed at Administer->Site Configuration->Performance. Also, if any content access module that restricts access to different kinds of content is enabled, block caching is automatically disabled.
3. cache_filter:
- Filters are used to filter node content to make it more secure and consumable. This is a very expensive operation to perform every time a node is viewed. Filters are applied to the node content when they are saved or created, and the results are cached in the cache_filter table.
4. cache_form:
- The form API uses this table to cache the generated form. This saves Drupal from regenerating the unchanged forms.
5. cache_menu:
- The menu system caches all created menus, along with their hierarchies and path information, in this table. This saves Drupal from having to regenerate the menu information for each page load. Menus are cached separately for each user and for each language.
6. cache_update:
- Information about all the installed modules and themes is cached in this table.
7. cache_page:
- Page caching is one of the most important optimizations, and it can truly improve the performance of a heavily used website. Entire page views are cached in the cache_page table. Drupal’s full page caching only caches pages for anonymous users. It does not cache pages for authenticated users because those pages are often customized for the user, which makes caching them less effective. Page caching saves Drupal from making expensive calls to generate the page repeatedly; instead, the cached page content can be retrieved in a single query. Administer->Site Configuration->Performance has several settings that affect page caching.
- Caching mode:
- Disabled: This disables the page caching although other types of caching would still continue.
- Normal: This mode offers a substantial improvement over the “Disabled” mode. Drupal’s bootstrapping is designed in such a way that only the minimum required amount of code and queries is executed to render a page from the cache. Even with this minimum code, the database system is initialized, access and sessions are initialized, the module system is initialized, and the hooks hook_boot and hook_exit are called on the modules that implement them. Only then is the cached page rendered.
- Aggressive: This mode skips the initialization of the module system, so hook functions are never called. This shaves off valuable time each time an anonymous user requests a page. But it also means that modules that implement those hooks may not work properly. Drupal warns (see image) by listing all the modules that may not function properly if the Aggressive mode is enabled.
- Minimum cache lifetime:
- This sets the minimum lifetime of cached content. If a user changes content, other users have to wait until this lifetime expires to see the changed content. If it is set to “none”, there is no wait, and all users can see the latest content immediately.
- Page compression:
- If enabled, the page content is compressed before it is cached. This saves bandwidth and improves download times. However, if the web server already performs the compression, this should be disabled. On an Apache server you can use the mod_deflate module to turn on compression. IMHO it is better to use Drupal’s page compression (maybe enhanced with modules like css_gzip or javascript_aggregator) rather than mod_deflate, because the latter does not cache the compressed output.
Note: Even if the Page cache and/or Block cache are disabled, other types of caching like menus, filters, etc. would still continue to happen and they cannot be disabled.
Clear cached data:
On the performance page at Administer->Site Configuration->Performance, you can use the “Clear cached data” button to clear ALL of the caches in the system, including the CSS and JavaScript optimizations. If you have your own cache table, you can implement hook_flush_caches() so that it is cleared as well when “Clear cached data” is executed.
Pluggable Cache:
Drupal provides a way to plug in a customized caching solution such as memory or file-based or hybrid (memory and database for fail-safe) caching.
There are 2 levels of plugging:
1. Solutions that use the Drupal cache API, but store information in a customized manner (like memory) instead of the Drupal ‘cache*’ tables.
- Example: memcache module
2. Solutions that provide their own cache API implementation, together with customized storage of information. Essentially, the complete Drupal cache system is bypassed. This is called the ‘fastpath’ mode (or is it page_fast_cache?). When a cached page is rendered using this technique, most of the Drupal bootstrap is skipped, and hence the Drupal statistics are not updated. So for statistics like page views, Drupal statistics may not be accurate.
- Example: cacherouter module
Drupal’s modules:
There are many contributed Drupal modules that extend the Drupal cache or provide integration with an external caching solution. Some of them even work for some parts of authenticated user pages. It is certainly possible to combine several contributed modules to get better performance.
Here are some of them with a very brief summary (in no particular order):
Memcache: This module lets you use a memcache server to do the caching. The nice thing is that it provides two types of caching: one memory only, and another memory or database. The latter approach certainly gives a smaller performance improvement than the first, but it provides a failsafe mechanism: if memcache is not available, it uses the database for caching.
Authcache: This module can cache authenticated user pages if the pages are the same for a particular user role. This module can be combined with Memcache or Cacherouter for customized caching.
Advcache: This module extends caching to areas that the Drupal core does not cover. The main advantage is for authenticated non-admin users that have a single role. This module can be combined with Memcache or Cacherouter for customized caching.
Cacheexclude: You can use this to exclude certain anonymous user pages from being cached.
Cacherouter: This module allows users to have a different caching technology for different cache tables. It has native support for many caching technologies like APC, Memcache, XCache, and even the database and file system.
Boost: Another external caching mechanism, mostly for anonymous users.
Cache: Similar to Cacherouter, it provides a mechanism to use different caching technologies.
References:
Pro Drupal Development, Second Edition by John K. VanDyk
Caching Modules comparison: http://groups.drupal.org/node/21897
Drupal caching, speed and performance: http://drupal.org/node/326504
Drupal guide to caching: http://lists.drupal.org/pipermail/documentation/2008-March/005949.html
Tip #6: How to make the Test Plan more flexible using variables and properties
As you can see, the Test Plan can be executed with default values, and if you need to modify some options you can pass them to the Test Plan through the command line without modifying the Test Plan itself.
Scalability 101
Scalability can be a confusing topic, because it is usually not defined in easy terms. If I were to characterize a scalable system:
- The system should be able to accommodate an increase in data
- The system should be able to accommodate an increase in usage
- As the load on the system increases, the system still remains relatively accessible and maintainable.
It is easy to confuse scalability with performance, but these are two separate characteristics. A high-performing system can quickly become non-performing if it cannot scale (the reverse is usually not true, though).
As the load increases on a system, we still want it to keep responding with a good (low) response time. This usually means that the hardware uses more resources to serve the requests. How the hardware is provisioned depends on the landscape architecture. We can choose to scale the hardware either vertically (scale up) or horizontally (scale out). These are very different approaches, but in a nutshell:
Vertical Scaling (scale up): Scaling is achieved by adding more hardware resources to an existing physical machine. Examples would be adding more memory, more hard disk, additional CPUs, etc. When a hardware resource starts to run at capacity, a bigger box is added to the mix. This hardware upgrade can continue until a limit is reached. Therefore there is a physical limit to vertical scaling.
Horizontal Scaling (scale out): We also add hardware to scale horizontally, except that horizontal scaling is achieved by adding machines in parallel to an existing machine. We can buy mid-range machines and keep adding a new one as each runs out of resources. But of course the scalability will rarely improve proportionally, and the TCO will also increase. There are networking supplies and setup required for each new machine, in addition to rack space, etc.
Moreover, some of the resources might be very underutilized in horizontal scaling. For example, in a typical web application, the network I/O and memory might be the bottlenecks. As we add more machines, we are also adding CPUs and hard disks, which are then underutilized.
As you can see, capacity planning for a server setup can become a thorny problem very quickly, and it requires a systematic approach to designing and scaling the landscape, depending on the application.
Another way to understand the issue is to look at Amdahl’s law. This model explains what performance improvement we can expect from adding resources in parallel.
Overall speedup = 1 / ((1 - P) + P / S)
P: proportion of the computation that is affected by the improvement.
S: speedup of the affected portion.
A simple example: if an improvement can speed up 30% of the computation, P will be 0.3; if the improvement makes the affected portion twice as fast, S will be 2. The overall speedup will be: 1/(0.7 + 0.3/2) = 1/(0.7 + 0.15) = 1/0.85 ≈ 1.18
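The arithmetic is easy to check with a few lines of Python. This is only a sketch of Amdahl's law, 1 / ((1 - P) + P / S), and the function name amdahl_speedup is my own. The second call shows why the unaffected portion sets a ceiling: even with an enormous S, the overall speedup cannot exceed 1 / (1 - P).

```python
def amdahl_speedup(p, s):
    """Overall speedup when a fraction p of the work is made s times faster."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if s <= 0:
        raise ValueError("s must be positive")
    return 1.0 / ((1.0 - p) + p / s)


# The example from the text: 30% of the work made twice as fast.
print(round(amdahl_speedup(0.3, 2), 3))    # 1.176

# Making that 30% almost infinitely fast still leaves the other 70% untouched.
print(round(amdahl_speedup(0.3, 1e9), 3))  # 1.429
```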
Who is responsible for security in the Cloud?
There are at least three categories of service providers in the cloud:
1. IaaS – Infrastructure as a Service (e.g. GoGrid, Amazon EC2)
2. PaaS – Platform as a Service (e.g. Force.com, Google App Engine)
3. SaaS – Software as a Service (e.g. Salesforce, SAP Business ByDesign, …)
The details of these different providers are not in the scope of this post. I will write about how to manage security on these platforms and who is responsible for which part of the security. There are two main parties involved – service providers and customers.
1. IaaS: The providers treat the deployed applications as black boxes and are mostly agnostic to the life cycle management of the hosted application’s stack. The stack runtime executes in the customer’s container (Java, PHP, Ruby, etc.) and is managed by the customer. Since the application is completely controlled by the customer, application-level security is also the responsibility of the customer.
The onus is on the web application developers to architect the application deployed in the cloud so that it can handle the Internet threat model. Security countermeasures have to adhere to standards like the OWASP Top Ten. (The 2010 release candidate spec for OWASP is here). Customers should design and implement applications with a “least-privileged” runtime model. (description of least-privileged).
The architecture of an IaaS-hosted application resembles the enterprise web application model. However, in an enterprise, distributed applications run with many controls in place to secure the network connections. Comparable controls might not exist in an IaaS platform.
2. PaaS: Since the cloud service providers are providing the platform, they provide the necessary mechanisms to secure the platform stack, including the runtime engine (the customers seldom have control over the platform). Vendors are reluctant to expose the technical details of the platform, in order to prevent attacks. But usually these are multi-tenant platforms. Therefore the core security tenets are isolation and containment of the applications from each other. Security provisioning is rather proprietary in PaaS, e.g. Google App Engine gives HTTPS support and Force.com offers an Apex API to configure security parameters. Essentially the broad choices are: SSL, user authentication using the service provider’s user store, and basic privilege management.
3. SaaS: The service provider owns the entire stack in this case and is responsible for providing a secure stack. The customer can usually manage security policies like user access rights and role assignments. In some cases, the customers might have read/write access at the object level. There might be security glitches even when the provider controls and hosts a sophisticated stack. For example Google had a slight glitch. And again.
To achieve maximum economies of scale, the providers might be hosting the customers on the same virtual box, separating the data only through program logic (tags, etc.). Therefore the customers have to be cognizant that security violations might occur due to a bug in the code.
In summary, depending on whether it is IaaS, PaaS or SaaS, the responsibility for security provisioning changes. Customers have most of the responsibility when using IaaS, and it is the service provider’s responsibility to provision a secure app all the way in a SaaS model.
Random stuff on jmeter testing
Detection
- You want to find out if your application behaves correctly when accessed by multiple threads. Whether your database starts showing deadlocks or whether you might have race conditions.
- You want to find out if your application has memory leaks.
- You want to find out how your application behaves under standard load or under some peak load (e.g. shopping during the holidays): response times, etc.
- In the specific case of Java-based web applications, you want to know how often your GC cycles run and how long they take.
- Your application makes remote calls (e.g. web service calls) and you want to know whether all the resources are recovered correctly. Perhaps you have some throttling mechanisms in place and you want to see that they are working correctly.
Verification
- You want to verify the result of changing some tuning parameters.
- You want to check that your SLAs are met.
Analysis
- You might already know there is a problem and you want to simulate load while you are profiling the application.
These areas overlap, and I've used the above categories quite broadly, merely to illustrate that your test script will vary based on your objective.
For example:
You want to find out if your application behaves correctly when accessed by multiple threads. In this case your test script would only be concerned with running some parts of the application at exactly the same time. You'd want to exercise multiple parts of your application. Perhaps you are aware that some part of the application internally spawns threads, and you'd run a test that exercises that area for a long time or with a high load. At this point you don't really care whether these are unrealistic or non-representative scenarios, nor are you really looking at the response times. All you care about is whether you see stuck threads or deadlocks, or a really long wait time for most threads while some threads finish really fast.
Or perhaps you want to tune a memory parameter and you want to verify the change.
In this case response times and throughput really matter (for the same test, of course). You'd first take a baseline reading without the tuning, and then another with the tuning. The test scripts must be representative of actual user behavior.
Perhaps you want to check whether your site can handle holiday shopping onslaughts. In this case you would modify your tests to show bursts of activity. You'd closely monitor response times, but you'd also want to check what happens on the server: how much memory, how much CPU. You might also want to see what load would actually bring down your servers. You might want to check whether your load balancers distribute the load evenly.
Or perhaps you have certain Service Level Agreements and you need to know response times accurately for the load specified in your agreement. In this case you need a representative user journey and you also need representative background users.
All of which means there is no easy answer to 'How do I analyse my application with JMeter?'. It can only be answered with 'What is it that you want to analyse?' (normally answered as, well, performance).
Let's take the most common use case: what is the 'response time' for my application?
However, actually getting the response time is more difficult than reading the response time calculation from the JMeter test results.
This is problematic because:
a. JMeter is not a browser and does not render the page. Different browsers take different times to render the same page. Compare older versions of Internet Explorer with Chrome, for example.
b. A returning user with some files cached will probably see lower times than a first-time user.
c. The network / connection speed from which the user is accessing the application may be significant. And your users may be spread out throughout the world.
d. AJAX-based / DHTML applications are difficult to predict: not only does behavior vary by browser, but the number of calls a browser may make in parallel also differs, and it is difficult to know which calls will be made in parallel.
So any response time would have (roughly speaking)
a. The time it takes for the application to actually respond with all the data
b. The time it takes for this data to be transferred over the network
c. The time it takes to download static files (bearing in mind that not all files may be downloaded and that browsers may request multiple static files in parallel)
d. The time it actually takes to render the page.
JMeter can help you out with a, b, and c, but what it is really good at is finding out a. for the network on which it is running.
Typically your requirements might define a Service Level Agreement for your site such as: browsing operations must take < 6 seconds 90% of the time and shopping operations must take < 8 seconds 90% of the time. You also know how large your pages are, and you can guesstimate how much time it would take for a page to be transferred over the internet. You might take an average with some safety factor, or you might take a worst-case scenario. Using a browser tool like YSlow or Google's PageSpeed, you can also get some insight into how your static files are downloaded, how long they take, etc. And you might add some time for how long the browser takes to render. After considering all of this you might arrive at a new figure: on a high-bandwidth intranet (which eliminates most of the network variables), your browsing operations must take < 2 seconds just to get the data and your shopping operations must take < 4 seconds for your SLAs to hold, because the rest of the time has already been used up by the other factors.
After this you would have to write a script which generates representative loads (for the operations being verified and the operations that would happen in the background), run the test, and verify that the 90th percentile lies below the value you calculated above. But perhaps it doesn't. Static files can be optimised by reducing their number and size, gzipping them, adding expiry headers, etc., but maybe you have already done this. The client's network and browser aren't within your control, so there isn't much you can do there. The next step is figuring out where your problem lies. JMeter can't help you there; you need a different set of tools. But JMeter can help you simulate the load, or parts of it, so that you can monitor your application with the tools of your choice. Some of your findings may be infrastructure related and some may be in the code; you'd have to make changes, retest and repeat.
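As a rough illustration of that verification step, the following Python sketch computes a 90th percentile from a list of response times and compares it against the server-side budget. The sample numbers are made up and stand in for the elapsed times you would read from a JMeter results file. The nearest-rank percentile used here is one common definition and may differ slightly from what a JMeter listener reports.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def meets_sla(samples_ms, budget_ms, pct=90):
    """True if the pct-th percentile response time is within the budget."""
    return percentile(samples_ms, pct) <= budget_ms


# Made-up response times in milliseconds for the browsing operations.
browse_ms = [420, 510, 380, 1900, 640, 700, 450, 2300, 530, 610]

print(percentile(browse_ms, 90))   # 1900
print(meets_sla(browse_ms, 2000))  # True: within the 2 second server-side budget
```

Note that one slow outlier (2300 ms here) does not break a 90% SLA, which is exactly why percentiles are used instead of the maximum or the average.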
Infrastructure availability and SLA
If you have had to deal with SLAs (service level agreements) from your internal IT or an external vendor, availability (uptime) is one of the first questions. This is a number usually presented as a percentage (usually 99.9%). Adding or removing a few 9s after the decimal point might not seem to matter much, but in reality it can. The table below makes the point:
Total downtime (HH:MM:SS)
Availability   Per day      Per month   Per year
99.999%        00:00:00.9   00:00:26    00:05:15
99.990%        00:00:08     00:04:22    00:52:35
99.900%        00:01:26     00:43:49    08:45:56
99.000%        00:14:24     07:18:17    87:39:29
While a 99% uptime SLA might not be acceptable because the downtime is still high (14 minutes per day), paying for 99.999% uptime might be overkill as well. The most common SLA numbers I have seen are 99.9% or 99.99%. Amusingly, I have also seen 100% uptime SLAs, but those companies are only asking for trouble, because I do not believe 100% uptime is practical even with every high availability algorithm and network architecture in place.
Hopefully these numbers are useful when you have to decide next time.
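If you want to work out the allowed downtime for a different availability figure or period, the calculation is just the period multiplied by the unavailable fraction. The Python sketch below is my own helper, not part of any SLA tool; the function names are made up, and seconds are truncated (not rounded) when formatting.

```python
def allowed_downtime_seconds(availability_pct, period_seconds):
    """Seconds of downtime permitted in a period at the given availability percentage."""
    return period_seconds * (100.0 - availability_pct) / 100.0


def hms(seconds):
    """Format a duration as HH:MM:SS, dropping fractions of a second."""
    whole = int(seconds)
    return "%02d:%02d:%02d" % (whole // 3600, whole % 3600 // 60, whole % 60)


DAY = 24 * 60 * 60  # seconds in a day

print(hms(allowed_downtime_seconds(99.9, DAY)))   # 00:01:26
print(hms(allowed_downtime_seconds(99.99, DAY)))  # 00:00:08
```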
EC2, GoGrid comparison
I saw the Gartner report on cloud providers recently. Having worked on EC2 a bit, I am more familiar with the EC2 API and features. But I have colleagues who are currently struggling with Terremark integration. Terremark uses the vCloud API from VMware. But given how nascent the vCloud APIs currently are, they are mostly scripts that one has to download and consume. The EC2 APIs, on the other hand, are far more polished, and Amazon has been quicker to respond to customer demands and requirements.
I am not necessarily advocating EC2, but I find it more favorable than some of the other cloud providers so far.
Based on the EC2 instance definitions and pricing model, EC2 seems to be cheaper than GoGrid.
Sample pricing for some of the options from both providers is listed below (as of this post):
Provider   Processor         RAM      Disk size   Cost per hour ($)
GoGrid     1 x Xeon          0.5 GB   30 GB       0.0950
GoGrid     6 x Xeon          8 GB     480 GB      1.5200
Amazon     1 virtual core    1.7 GB   160 GB      0.0850
Amazon     2 virtual cores   7.5 GB   850 GB      0.34
Tip #5: How to run a test plan for a certain amount of time
- By specifying the test duration in the Thread Group GUI (the Scheduler checkbox must be checked). Note that you must set the Startup delay option to zero; otherwise you will need to specify the Start Time option before each test run. And of course you need to check the Forever checkbox of the Loop Count option. The screenshot below shows an example of a 10 minute test.
- By creating an extra Thread Group with two Test Action samplers. The first Test Action is configured as a pause, and the second one is configured to stop all threads. Look at the screenshots below for an example of a 10 minute test.
Tip #4: Using JMeter properties
Here JMeter properties are used as the values of the "Number of Threads (users)" and "Loop Count" options: ${__P(users)} and ${__P(count)}. We must also add two parameters to the JMeter command line:
jmeter -t TestPlan.jmx -Jusers=10 -Jcount=50
Now you can specify the necessary parameters on the fly. I recommend using properties for the following options: number of users, loop count, host, port, results and data filenames, etc.
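If you launch tests from a script, the -J overrides are easy to generate. The sketch below builds such a command line in Python; the helper name jmeter_command and the results file name results.jtl are my own assumptions, while -n (non-GUI mode), -t (test plan), -l (results log) and -J (property override) are standard JMeter options. It only builds the argument list; pass it to something like subprocess.run() to actually start JMeter.

```python
def jmeter_command(test_plan, results_file, **properties):
    """Build a non-GUI JMeter command line with one -J override per property."""
    cmd = ["jmeter", "-n", "-t", test_plan, "-l", results_file]
    for name, value in sorted(properties.items()):
        cmd.append("-J%s=%s" % (name, value))
    return cmd


cmd = jmeter_command("TestPlan.jmx", "results.jtl", users=10, count=50)
print(" ".join(cmd))
# jmeter -n -t TestPlan.jmx -l results.jtl -Jcount=50 -Jusers=10
```

Inside the test plan, ${__P(users,1)} supplies a default of 1 when no -Jusers is given, so the same plan still runs unchanged from the GUI.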
Not seeing CCK fieldgroup in form_alter
I was trying to create different tabs (using YUI) to edit a form. I wanted to create a separate tab for each of the field groups. However, in the form_alter method of the form for my module, I could not see the fields in the fieldgroup.
On debugging I found my fields in the fieldgroup to be present in the form_alter of other modules.
Problem: The CCK fieldgroup module has a weight of 9. If your module has a weight less than 9, then it is called before CCK fieldgroup, hence the problem.
Solution: Set your module’s weight to more than 9. The setting is in the system table, in the weight column.
This fixed the problem for me.
It took a long time to solve the problem; thanks to the authors of the posts listed below for writing about this.
1. Benjamin
2. Andy
JSP Interview questions
2. What are the various ways you have implemented security in your web application? Also mention alternatives that can be used.
3. When would you use servlets? Name one thing you can do with a servlet that you cannot do with a JSP.
4. Which framework have you used? Please describe the shortcomings of the framework and the ways to work around them.
5. What is the difference between a static include, jsp:include, jsp:forward and redirect? Please give one example of each.
6. Describe some tag libraries you have written. What are the advantages / disadvantages of a tag library? If possible, give an example of a badly written open source JSP tag library (or any of your own) and how you would improve it.
7. Give one example of when you would use a filter. How do you execute a filter after the request?
8. Give one example of when you would use a listener. Which listeners have you used in your project?
9. Besides defining servlets , name one feature that can be specified in the web deployment descriptor
10. Name JSP implicit object. Give one example of the use of each implicit object. Is exception an implicit object? If so can i write exception.printStackTrace in any JSP? what will it print?
11. Comment whether a JSP web application works with cookies disabled.
12. Which web container did you use. Please describe how to use a container specific feature.
13. Describe some JSTL tags you have used and what they are useful for
14. How do you use a datasource in a JSP?
15. How do you internationalize a web application. What all do you need to take care of
Tip #3: Pauses in test plan
- Constant pause. Just add a Test Action sampler after the Transaction controller. Specify the pause duration in ms and make sure that the "Pause" item of the "Action" option is selected.
- Variable pause. Add a Test Action as described above but specify the pause duration as 0 (zero). Then add a Uniform Random Timer as a child of the Test Action and specify the minimum value and the maximum offset value.
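To make the difference between the two kinds of pause concrete, here is a small sketch of the arithmetic. It assumes that the variable pause is a fixed minimum plus a uniformly distributed random extra; the class and method names are hypothetical and this is not JMeter's own code.

```java
import java.util.Random;

// Hypothetical illustration of constant versus variable pauses between transactions.
final class PauseSketch {

    /** Constant pause: every iteration waits exactly the configured duration. */
    static long constantPause(long durationMs) {
        return durationMs;
    }

    /** Variable pause: the minimum plus a uniformly random extra in [0, maxExtraMs). */
    static long variablePause(long minimumMs, long maxExtraMs, Random random) {
        if (maxExtraMs <= 0) {
            return minimumMs;
        }
        return minimumMs + (long) (random.nextDouble() * maxExtraMs);
    }

    public static void main(String[] args) {
        Random random = new Random();
        System.out.println("constant: " + constantPause(1500) + " ms");
        for (int i = 0; i < 3; i++) {
            System.out.println("variable: " + variablePause(1000, 2000, random) + " ms");
        }
    }
}
```

A variable pause spreads requests out in time, which is usually closer to how real users behave than a fixed think time.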
Let the data in cloud rest securely
One of the prime concerns of anyone using the public cloud (like Amazon EC2, etc.) is the security of the data stored in the physical cloud. Data security is a concern at two stages:
- Data-at-rest: Stored data on the physical storage volumes
- Data-in-transit: While the data is being transferred between servers.
While data-in-transit can be secured using HTTPS, FTPS, etc., data-at-rest is trickier to keep encrypted.
Encrypting and decrypting all data at runtime can be a fairly expensive strategy, yet it is necessary if data security is required: if the data is encrypted before being stored, it has to be decrypted before being consumed by the application.
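The encrypt-before-store, decrypt-before-use cycle can be sketched with the standard Java crypto API. This is a minimal illustration, assuming AES-GCM with a random 12-byte IV stored in front of the ciphertext; the class name AtRestCrypto is hypothetical, and key management (where the key lives, who can read it) is deliberately left out.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical helper: encrypts data before it is written to a volume and decrypts it on read.
final class AtRestCrypto {
    private static final int IV_BYTES = 12;
    private static final int TAG_BITS = 128;
    private static final SecureRandom RANDOM = new SecureRandom();

    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns IV followed by ciphertext: the bytes that would actually be stored. */
    static byte[] encrypt(SecretKey key, byte[] plaintext) {
        try {
            byte[] iv = new byte[IV_BYTES];
            RANDOM.nextBytes(iv); // a fresh IV for every record
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ciphertext = cipher.doFinal(plaintext);
            byte[] stored = new byte[IV_BYTES + ciphertext.length];
            System.arraycopy(iv, 0, stored, 0, IV_BYTES);
            System.arraycopy(ciphertext, 0, stored, IV_BYTES, ciphertext.length);
            return stored;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reverses encrypt(): reads the IV from the front, then decrypts the rest. */
    static byte[] decrypt(SecretKey key, byte[] stored) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, stored, 0, IV_BYTES));
            return cipher.doFinal(stored, IV_BYTES, stored.length - IV_BYTES);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        byte[] stored = encrypt(key, "customer record".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(decrypt(key, stored), StandardCharsets.UTF_8));
    }
}
```

Every read and write pays for one cipher operation, which is the runtime cost mentioned above.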
Moreover, data encryption will not work for the companies which want their data to be indexed by search engines (fortunately, the problem is also less severe for them).
Data encryption is important not only because stored data can be compromised, but also because of another factor called data remanence. Unless adequate measures are taken, data that has supposedly been removed from physical storage might continue to persist, albeit partially. NIST has guidelines for data sanitization, which cloud providers could follow, but to my knowledge no cloud provider currently offers this as part of an SLA. Also, to my knowledge, Amazon EC2 does not provide encryption on its EBS volumes.
The cloud providers are still evolving and the security of data-at-rest is still an open issue. However, I did stumble upon a very interesting research project being conducted by IBM and Stanford.
This research is still nascent, but if homomorphic encryption can really be applied, then data would not need to be decrypted in order to be processed. This can boost both performance and security. Performance, because data would not toggle between unencrypted (while being handled by the code) and encrypted (while being stored). Security, because at no point is the data unencrypted, which also alleviates the data remanence problem.
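To get a feel for what "computing on encrypted data" means, here is a toy sketch. It is not the scheme from the IBM and Stanford research: it uses textbook RSA without padding, which happens to be homomorphic for multiplication only, with tiny hard-coded primes. It is insecure and purely illustrative, and all names in it are hypothetical.

```java
import java.math.BigInteger;

// Toy illustration only: unpadded RSA lets you multiply two values while both stay encrypted.
final class ToyHomomorphism {
    static final BigInteger P = BigInteger.valueOf(61);
    static final BigInteger Q = BigInteger.valueOf(53);
    static final BigInteger N = P.multiply(Q);
    static final BigInteger E = BigInteger.valueOf(17);
    // Private exponent: the inverse of E modulo (P - 1) * (Q - 1).
    static final BigInteger D =
            E.modInverse(P.subtract(BigInteger.ONE).multiply(Q.subtract(BigInteger.ONE)));

    static BigInteger encrypt(BigInteger message) {
        return message.modPow(E, N);
    }

    static BigInteger decrypt(BigInteger ciphertext) {
        return ciphertext.modPow(D, N);
    }

    /** Multiplies two ciphertexts; neither input is decrypted along the way. */
    static BigInteger multiplyEncrypted(BigInteger c1, BigInteger c2) {
        return c1.multiply(c2).mod(N);
    }

    public static void main(String[] args) {
        BigInteger a = encrypt(BigInteger.valueOf(7));
        BigInteger b = encrypt(BigInteger.valueOf(6));
        // The party doing the multiplication never sees 7 or 6.
        System.out.println(decrypt(multiplyEncrypted(a, b)));
    }
}
```

A fully homomorphic scheme would support arbitrary computation on ciphertexts, not just one operation, which is what makes the research interesting.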
Video: Using open source tools for performance testing
JMeter Graphs and ANT
The sample report that I generated is shown below
In the custom report, the column titles of the normal summary table are hyperlinks that lead to the corresponding graphs.
To implement this we need to
a. Allow the graph code to be invoked from ANT. This can be done by writing a simple Java class with a main method that takes its parameters from the command line, or by writing a custom ant task. I wrote a custom ant task as a proof of concept. We need to customise the build script as well.
b. Modify the XSLT to write out image anchors.
These steps are described below.
The custom ant Task
package org.md.jmeter.ant;

import java.io.File;

import org.apache.tools.ant.BuildException;
import org.apache.tools.ant.Task;

// GraphClient is the graph API used below; add an import for it here if it
// lives in a different package (its package is not shown in this post).

/** Ant task that writes the aggregate graph images for a JMeter result file. */
public class AggregateGraphTask extends Task {
    // One field per attribute of the <aggregatechart> element in the build file.
    private String outputDir;
    private String outputFilePrefix;
    private Boolean showThreshold = Boolean.TRUE;
    private Double threshold = 500D;
    private String jmeterResultFile;
    private String jmeterHome;

    public String getJmeterHome() {
        return jmeterHome;
    }

    public void setJmeterHome(String jmeterHome) {
        this.jmeterHome = jmeterHome;
    }

    public String getJmeterResultFile() {
        return jmeterResultFile;
    }

    public void setJmeterResultFile(String jmeterResultFile) {
        this.jmeterResultFile = jmeterResultFile;
    }

    public String getOutputDir() {
        return outputDir;
    }

    public void setOutputDir(String outputDir) {
        this.outputDir = outputDir;
    }

    public String getOutputFilePrefix() {
        return outputFilePrefix;
    }

    public void setOutputFilePrefix(String outputFilePrefix) {
        this.outputFilePrefix = outputFilePrefix;
    }

    public Boolean getShowThreshold() {
        return showThreshold;
    }

    public void setShowThreshold(Boolean showThreshold) {
        this.showThreshold = showThreshold;
    }

    public Double getThreshold() {
        return threshold;
    }

    public void setThreshold(Double threshold) {
        this.threshold = threshold;
    }

    @Override
    public void execute() throws BuildException {
        try {
            GraphClient.init(jmeterHome);
            // Images are written as <outputDir>/<outputFilePrefix>-<suffix>.png
            String outputPrefix = outputDir + File.separator + outputFilePrefix;
            if (Boolean.TRUE.equals(showThreshold)) {
                GraphClient.writeAggregateChartWithThreshold(jmeterResultFile, outputPrefix, showThreshold, threshold);
            } else {
                GraphClient.writeAggregateChart(jmeterResultFile, outputPrefix);
            }
        } catch (Exception e) {
            // Fail the build if the graphs cannot be generated.
            throw new BuildException(e);
        }
    }
}
This is a pretty straightforward class which calls our APIs based on the parameters passed to it. It declares fields for all the attributes it expects and then calls the graph APIs.
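The reason the task needs a setter per field is that Ant fills in task attributes by calling the matching setter and converting the attribute string to the setter's parameter type. The sketch below mimics that idea with plain reflection. It is a simplified illustration, not Ant's implementation: AttributeBinder and ChartSettings are hypothetical names, and only String, Boolean and Double are converted.

```java
import java.lang.reflect.Method;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bean with the same kind of setters as the custom task.
final class ChartSettings {
    private String outputDir;
    private Boolean showThreshold = Boolean.TRUE;
    private Double threshold = 500D;

    public void setOutputDir(String outputDir) { this.outputDir = outputDir; }
    public void setShowThreshold(Boolean showThreshold) { this.showThreshold = showThreshold; }
    public void setThreshold(Double threshold) { this.threshold = threshold; }
    public String getOutputDir() { return outputDir; }
    public Boolean getShowThreshold() { return showThreshold; }
    public Double getThreshold() { return threshold; }
}

// Simplified stand-in for the attribute handling a build tool performs.
final class AttributeBinder {

    /** For each attribute "xxx", calls setXxx(...) on the target with a converted value. */
    static void bind(Object target, Map<String, String> attributes) {
        for (Map.Entry<String, String> attribute : attributes.entrySet()) {
            String name = attribute.getKey();
            String setterName = "set" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
            Method setter = findSetter(target.getClass(), setterName);
            try {
                setter.invoke(target, convert(attribute.getValue(), setter.getParameterTypes()[0]));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot set attribute " + name, e);
            }
        }
    }

    private static Method findSetter(Class<?> type, String name) {
        for (Method method : type.getMethods()) {
            if (method.getName().equals(name) && method.getParameterCount() == 1) {
                return method;
            }
        }
        throw new IllegalArgumentException("No setter named " + name);
    }

    /** Converts the attribute text to the type the setter expects. */
    private static Object convert(String value, Class<?> type) {
        if (type == Boolean.class || type == boolean.class) {
            return Boolean.valueOf(value);
        }
        if (type == Double.class || type == double.class) {
            return Double.valueOf(value);
        }
        return value;
    }

    public static void main(String[] args) {
        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("outputDir", "report");
        attributes.put("showThreshold", "true");
        attributes.put("threshold", "200");
        ChartSettings settings = new ChartSettings();
        bind(settings, attributes);
        System.out.println(settings.getOutputDir() + " " + settings.getThreshold());
    }
}
```

This is why an attribute such as threshold="200" in the build file ends up as a Double in the task without any parsing code in the task itself.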
The ANT build.
I assume you already have JMeter working from ANT, with all the libraries needed for ANT and JMeter in place.
<project name="Jmeter" basedir="." default="runOfflineGraph">
<property name="lib.dir" value="${basedir}/lib"/>
<property name="report.dir" value="${basedir}/report"/>
<property name="styles.dir" value="${basedir}/styles"/>
<property name="export.dir" value="${basedir}/export"/>
<property environment="env"/>
<property name="jmeter.home.dir" value="${env.JMETER_HOME}"/>
<property name="jfreechart.home.dir" value="${env.JFREECHART_HOME}"/>
<path id="run.classpath">
<fileset dir="${jmeter.home.dir}" includes="**/*.jar"/>
<fileset dir="${jfreechart.home.dir}" includes="**/*.jar"/>
<fileset dir="${lib.dir}" includes="*.jar"/>
</path>
<target name="clean">
<delete dir="${report.dir}" />
<delete dir="${export.dir}" />
</target>
<target name="init">
<mkdir dir="${report.dir}" />
<mkdir dir="${export.dir}" />
</target>
<target name="runJMeter" depends="init">
<taskdef
name="jmeter"
classname="org.programmerplanet.ant.taskdefs.jmeter.JMeterTask"/>
<taskdef name="aggregatechart" classname="org.md.jmeter.ant.AggregateGraphTask" classpathref="run.classpath"/>
<tstamp />
<property name="uniqueTStamp" value="${DSTAMP}${TSTAMP}" />
<property name="imageNamePrefix" value="AggregateChartThreshold-${uniqueTStamp}" />
<property name="jmeter.result.fileName" value="${run.test.report}-${uniqueTStamp}" />
<property name="jmeter.result.file" value="${report.dir}/${jmeter.result.fileName}.jtl" />
<jmeter
jmeterhome="${jmeter.home.dir}"
testplan="${run.test.plan}"
resultlog="${jmeter.result.file}">
<property name="jmeter.save.saveservice.output_format" value="xml"/>
<property name="run.threadcount" value="${run.threadcount}" />
<property name="run.loopcount" value="${run.loopcount}" />
<property name="sample_variables" value="${sample_variables}" />
</jmeter>
<xslt
in="${jmeter.result.file}"
out="${report.dir}/${jmeter.result.fileName}.html"
style="${styles.dir}/${xsl.file}">
<param name="imageNamePrefix" expression="${imageNamePrefix}"/>
</xslt>
<aggregatechart jmeterHome="${jmeter.home.dir}" jmeterResultFile="${jmeter.result.file}" outputDir="${report.dir}"
outputFilePrefix="${imageNamePrefix}" showThreshold="true" threshold="200"/>
</target>
<target name="runOfflineGraph" depends="init">
<antcall target="runJMeter">
<param name="run.test.plan" value="OfflineGraphs.jmx"/>
<param name="run.test.report" value="OfflineGraph"/>
<param name="sample_variables" value=""/>
<param name="run.threadcount" value="1"/>
<param name="run.loopcount" value="5"/>
<param name="xsl.file" value="OfflineGraph.xsl" />
</antcall>
</target>
</project>
Important points are
a) We define a run.classpath which has everything we need at runtime to generate the graphs.
b) We have a taskdef, aggregatechart, for our custom task.
c) We invoke the custom chart task, passing it the parameters it needs. These are closely linked with the previous steps in the build. The result log from JMeter (${jmeter.result.file}) is passed as input to the task. The prefix for the generated image file names (${imageNamePrefix}) is important because we need to reference it in the stylesheet. The directory to which the graph code writes must be the same as the XSLT output directory (or at least the XSLT and the graph code must agree on where the HTML references the images from).
The XSLT stylesheet
The changes here are pretty straightforward. I've copied extras/jmeter-results-report_21.xsl and renamed it to OfflineGraph.xsl.
The important changes are
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:param name="imageNamePrefix">AggregateChartThreshold</xsl:param>
We pass in a parameter that is to be used while generating the img tag in the HTML
<xsl:call-template name="summary" />
<hr size="1" width="95%" align="left" />
<xsl:call-template name="pagelist" />
<hr size="1" width="95%" align="left" />
<xsl:call-template name="detail" />
<xsl:call-template name="graph-images" />
We call our custom template
<th><a href="#AverageGraph">Average Time</a></th>
<th><a href="#MinimumGraph">Min Time</a></th>
<th><a href="#MaximumGraph">Max Time</a></th>
We change the titles to be anchors (named anchors that are next to the images)
<xsl:template name="output-image">
<xsl:param name="suffix" />
<xsl:element name="img">
<xsl:attribute name="src"><xsl:value-of select="$imageNamePrefix" />-<xsl:value-of select="$suffix" />.png</xsl:attribute>
</xsl:element>
</xsl:template>
<xsl:template name="graph-images">
Graphs
<br />
<b>Minimum</b>
<br />
<a name="MinimumGraph"></a>
<xsl:call-template name="output-image">
<xsl:with-param name="suffix">Min</xsl:with-param>
</xsl:call-template>
<br />
<b>Maximum</b>
<br />
<a name="MaximumGraph"></a>
<xsl:call-template name="output-image">
<xsl:with-param name="suffix">Max</xsl:with-param>
</xsl:call-template>
<br />
<b>Average</b>
<br />
<a name="AverageGraph"></a>
<xsl:call-template name="output-image">
<xsl:with-param name="suffix">Avg</xsl:with-param>
</xsl:call-template>
<br />
<b>Median</b>
<br />
<a name="MedianGraph"></a>
<xsl:call-template name="output-image">
<xsl:with-param name="suffix">Median</xsl:with-param>
</xsl:call-template>
<br />
<b>90 percentile</b>
<br />
<a name="NinetyPerGraph"></a>
<xsl:call-template name="output-image">
<xsl:with-param name="suffix">90</xsl:with-param>
</xsl:call-template>
</xsl:template>
Finally we output the img tags. The knowledge of how the Aggregate Graph generates the suffix for each image is hardcoded into the stylesheet. The passed parameter is used to form the filename as well (this allows multiple runs without the images overwriting previous ones).
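The naming contract between the build, the graph code and the stylesheet can be summarised in a few lines. The sketch below assumes the timestamp is formatted as yyyyMMddHHmm (date followed by time, as in the ${DSTAMP}${TSTAMP} property of the build) and uses the five suffixes that appear in the stylesheet; the class name GraphImageNames is hypothetical.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper showing how the image file names are put together.
final class GraphImageNames {
    // Suffixes the stylesheet expects, one image per aggregate value.
    static final String[] SUFFIXES = {"Min", "Max", "Avg", "Median", "90"};

    // Assumed timestamp layout: date then time, for example 201001151430.
    private static final DateTimeFormatter STAMP = DateTimeFormatter.ofPattern("yyyyMMddHHmm");

    /** A prefix that is unique per run, so images from earlier runs are not overwritten. */
    static String prefix(String base, LocalDateTime runTime) {
        return base + "-" + runTime.format(STAMP);
    }

    /** File names in the form <prefix>-<suffix>.png, matching the img src in the stylesheet. */
    static List<String> fileNames(String prefix) {
        List<String> names = new ArrayList<>();
        for (String suffix : SUFFIXES) {
            names.add(prefix + "-" + suffix + ".png");
        }
        return names;
    }

    public static void main(String[] args) {
        String prefix = prefix("AggregateChartThreshold", LocalDateTime.now());
        fileNames(prefix).forEach(System.out::println);
    }
}
```

If the graph code ever changes a suffix, the stylesheet has to change with it, which is the cost of hardcoding this knowledge in two places.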
Running the ANT build
This is the run.cmd I use (Windows only)
set JAVA_HOME=C:\bea102\jdk150_11
set JMETER_HOME=C:\projects\R1-Portal-CMS\test\jakarta-jmeter-2.3.4
set JFREECHART_HOME=C:\work\java\jfreechart-1.0.13
set ANT_HOME=C:\work\java\apache-ant-1.7.1
set PATH=%JAVA_HOME%\bin;%ANT_HOME%\bin;%PATH%
set CLASSPATH=%JMETER_HOME%\extras\ant-jmeter-1.0.9.jar;%CLASSPATH%
ant %*
We set the environment variables that the build needs and then run it.
Source Code is available here
Aggregate Graphs in JMeter using JFreeChart (3D Bar Charts) with thresholds
Sample Images
Sample Code
public static void writeAggregateChartWithThreshold() throws Exception {
    File f = new File(JMETER_RESULT_FILE);
    ResultCollector rc = new ResultCollector();
    // Arguments: output file prefix, show the threshold, threshold value
    AggregateChartVisualizer v = new AggregateChartVisualizer(
            ConfigUtil.getOutputGraphDir() + "/AggregateChartThreshold", true, 500);
    ResultCollectorHelper rch = new ResultCollectorHelper(rc, v);
    // Parse the JTL result file; the samples are fed to the visualizer
    XStreamJTLParser p = new XStreamJTLParser(f, rch);
    p.parse();
    // Write the chart images to disk
    v.writeOutput();
}
Source code available here
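For readers who want to see what goes into such a chart, here is a rough sketch of the aggregate values (minimum, maximum, average, median and 90th percentile) computed from a list of response times, plus a simple threshold check. It assumes a nearest-rank percentile and compares only the average against the threshold; both choices are mine for illustration and are not taken from the visualizer code, and the class name AggregateStats is hypothetical.

```java
import java.util.Arrays;

// Hypothetical summary of one sampler's response times (in ms).
final class AggregateStats {
    final long min;
    final long max;
    final double average;
    final long median;
    final long percentile90;

    AggregateStats(long[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        min = sorted[0];
        max = sorted[sorted.length - 1];
        average = Arrays.stream(sorted).average().getAsDouble();
        median = percentile(sorted, 50);
        percentile90 = percentile(sorted, 90);
    }

    /** Nearest-rank percentile: the smallest value with at least percent% of samples at or below it. */
    static long percentile(long[] sorted, int percent) {
        int rank = (percent * sorted.length + 99) / 100; // ceil(percent * n / 100)
        return sorted[Math.max(rank, 1) - 1];
    }

    /** True when the average response time is above the threshold. */
    boolean exceeds(double thresholdMs) {
        return average > thresholdMs;
    }

    public static void main(String[] args) {
        AggregateStats stats = new AggregateStats(new long[] {120, 480, 510, 300, 950});
        System.out.println("avg=" + stats.average + " 90%=" + stats.percentile90
                + " above 500 ms: " + stats.exceeds(500));
    }
}
```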
