Common API Performance Techniques

by Emanuela Hedrick.


Websites are designed to be accessed by individuals, and as such tend to rely on the relatively slow speed of the user to avoid performance bottlenecks. This approach fails miserably with APIs, because they are consumed by other servers with high-speed connections, often designed only with their own performance in mind (they won't cache your responses for you, and will instead make exactly the same request time and time again). Designing your API with performance in mind helps keep the server fast even when many requests are being made, and helps ensure that future hardware upgrades deliver the improvement you expect from them.

Note

Many websites are either designed poorly or appear to lack any sort of design whatsoever. I've seen a site that required 10 database queries to start the page, then an additional query for every item in its database. With more than 40 items in the database, approximately 50 queries were being made every time the index page loaded. This design was failing horribly for a website receiving relatively few hits. It wouldn't have lasted minutes if it were consumed automatically, and I doubt it would have lasted more than a few seconds under the Slashdot effect. All these database queries were basically pointless as well; the firm's inventory changed slowly, so a static page generated once per week from the same script would have functioned just as well for the end user, but would have been several orders of magnitude faster.

Caching Data

Often both websites and APIs request data from the database each and every time a request is made, even though the data used to populate the response changes rarely. This, combined with the database normalization techniques taught since the beginning of time, means that each of those requests is likely making at least one query joining results from multiple tables, possibly multiple queries. If your data isn't changing that often, consider caching the response.

For example, take the fictional Bob's Video website. Every time someone either views detailed information about a movie on his website or requests it through his API, his server runs three queries: one that finds the movie's full title, plot line, and rating; a joined query that retrieves detailed information on each of the cast members; and a final query that determines the film's rental status. This is a gigantic waste of resources; once a movie is released, the only part of the response that will change is its rental status. Yet each and every time the page is loaded, all of the data is requested again from the database. It would make far more sense to either use a static page for released films (populating rental status dynamically), or at the very least cache all the film's information and retrieve only the rental status dynamically.

Note

You've probably noticed that, because the cache will likely end up in a database, I've really only reduced the query count from 3 to 2. It doesn't look like a drastic improvement, but it is. The joined query looking up detailed information for the cast members is likely to be an order of magnitude slower than a lookup based on a primary key, so there is a big saving there. You can also cache the movie data in a form close to its final web form, saving on all the processing generally needed to go from database to web page. You will need two caches in this example, one for the website and one for the API.
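The Bob's Video idea can be sketched roughly as follows. This is only an illustration in Python, not the author's implementation: the three `query_*` functions are hypothetical stand-ins for the database queries, the cache is a plain in-memory dictionary rather than a database table, and the 24-hour lifetime is an assumption.

```python
import time

# Hypothetical stand-ins for the three database queries; each call is logged
# so the number of queries actually run can be inspected.
query_log = []


def query_movie_details(movie_id):
    query_log.append("details")
    return {"title": "Example Film", "plot": "A placeholder plot.", "rating": "PG"}


def query_cast(movie_id):
    query_log.append("cast")  # stands in for the expensive joined query
    return ["Actor One", "Actor Two"]


def query_rental_status(movie_id):
    query_log.append("rental_status")
    return "available"


CACHE_TTL_SECONDS = 24 * 60 * 60  # assumed cache lifetime
_movie_cache = {}  # movie_id -> (expires_at, cached film data)


def get_movie(movie_id, now=None):
    """Return film data, running the slow queries only when the cache is stale."""
    now = time.time() if now is None else now
    entry = _movie_cache.get(movie_id)
    if entry is None or entry[0] <= now:
        data = dict(query_movie_details(movie_id), cast=query_cast(movie_id))
        entry = (now + CACHE_TTL_SECONDS, data)
        _movie_cache[movie_id] = entry
    # Rental status changes often, so it is always fetched fresh.
    return dict(entry[1], rental_status=query_rental_status(movie_id))
```

Calling `get_movie(1)` twice in a row runs the details and cast queries once and the rental status query twice; inspecting `query_log` shows which queries actually ran.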

Smarter Use of Database Queries

Although caching data is an excellent method of reducing the number of queries you run, it isn't always appropriate. When it isn't, make sure you are getting the most out of each query you do run. Duplicate data is often requested while handling a single request; this happens when different functions need the same data but don't call each other, so they can't share their results. Consider either reworking your script to obtain all required data itself and then pass it to the functions that require it, or creating an abstraction layer with an object that obtains information from the database only when required.
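One possible shape for such an abstraction layer is sketched below in Python. The class name `RequestData`, the loader functions, and the two rendering functions are all hypothetical; the point is only that each piece of data is fetched at most once per request, and only if something asks for it.

```python
class RequestData:
    """Hypothetical per-request data layer: each item is loaded at most once."""

    def __init__(self, loaders):
        self._loaders = loaders  # name -> zero-argument function that runs the query
        self._loaded = {}        # results fetched so far during this request

    def get(self, name):
        if name not in self._loaded:
            # First request for this item: run the query and remember the result.
            self._loaded[name] = self._loaders[name]()
        return self._loaded[name]


queries_run = []  # records which (simulated) queries were executed


def load_customer():
    queries_run.append("customer")
    return {"name": "Example Customer"}


def load_order_history():
    queries_run.append("order_history")
    return []


def render_header(data):
    return "Welcome, " + data.get("customer")["name"]


def render_sidebar(data):
    return "Account: " + data.get("customer")["name"]


# Two unrelated functions need the customer record, but it is queried only once,
# and the order history is never queried because nothing asks for it.
request_data = RequestData({"customer": load_customer, "order_history": load_order_history})
header = render_header(request_data)
sidebar = render_sidebar(request_data)
```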

Once you're using your database queries to their fullest, begin work on improving the speed of the queries themselves. Never start queries with SELECT * FROM; request only the fields you actually need. Also examine both your queries and your database schema: try to ensure that the fields you base your selection on are either primary keys or at least indexed by the database server.
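To make the indexing advice concrete, here is a small sketch using Python's built-in sqlite3 module with an in-memory database. The `movies` table, its columns, and the index name are assumptions made for the illustration; other database servers expose query plans differently, but the idea of checking whether a lookup uses an index carries over.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's description of how it would execute the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the detail text


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT, plot TEXT, rating TEXT)"
)
conn.executemany(
    "INSERT INTO movies (title, plot, rating) VALUES (?, ?, ?)",
    [
        ("Example Film", "A placeholder plot.", "PG"),
        ("Another Film", "Another placeholder plot.", "R"),
    ],
)

# Name only the needed columns instead of using SELECT *.
LOOKUP = "SELECT title, rating FROM movies WHERE title = ?"

# Without an index on title, the lookup has to scan the whole table.
plan_before = query_plan(conn, LOOKUP, ("Example Film",))

# Index the column used for selection, then look at the plan again.
conn.execute("CREATE INDEX idx_movies_title ON movies (title)")
plan_after = query_plan(conn, LOOKUP, ("Example Film",))

row = conn.execute(LOOKUP, ("Example Film",)).fetchone()
```

Printing `plan_before` and `plan_after` shows the change from a full table scan to a search that uses `idx_movies_title`; the exact wording of the plan varies between SQLite versions.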

Response Caching

Consider the case of a Video Store API that allows users to request information on films. With a small design change (moving rental status to its own request, rather than providing it with each film response), many new caching opportunities present themselves. Because the response doesn't change regardless of who requests it, a caching proxy server can be placed in front of the API to serve the response (this is much easier with REST APIs than with SOAP). Setting the appropriate headers for cache life (24 hours for films, and 30 minutes for rental status) will allow the API to offload most of its work to the proxy server.
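As a rough sketch, the cache lifetimes mentioned above could be expressed with the standard HTTP Cache-Control header. The resource type names and the helper function are hypothetical, and how the headers get attached to a response depends on whatever framework or server the API runs on.

```python
# Cache lifetimes from the text, in seconds.
CACHE_LIFETIMES = {
    "film": 24 * 60 * 60,       # film details: 24 hours
    "rental_status": 30 * 60,   # rental status: 30 minutes
}


def cache_headers(resource_type):
    """Build response headers telling a shared proxy how long it may reuse a response."""
    max_age = CACHE_LIFETIMES[resource_type]
    # "public" marks the response as safe to store in a shared cache.
    return {"Cache-Control": f"public, max-age={max_age}"}
```

For example, `cache_headers("film")` yields a `Cache-Control` value of `public, max-age=86400`, so a proxy honoring the header would contact the API for a given film at most about once a day.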

PHP Accelerators

There are a few PHP accelerators available, which can have a drastic effect on the speed of your scripts. Every time a PHP script is executed, it is parsed and compiled into byte code by PHP's scripting engine. Because, generally speaking, the script hasn't changed between executions, this is a huge waste of processing time. PHP accelerators cache the byte code version of the scripts and execute that copy (being mindful of any changes to the original script). This saves the parse and compilation steps each time the script is executed. Because your API will be called with great frequency and changed rarely, this can be a significant saving.

It is important to realize how PHP accelerators work to avoid having undue expectations for their results. Consider the parse and compilation time for a script as a fixed cost — every time the script is accessed, regardless of the speed of other resources (databases, for example) or how much processing the finished script requires, this cost must be paid. Caching the byte code copy of the script only saves on that cost; it will not speed your database queries or other CPU-intensive processes.
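The general idea can be illustrated with a small analogy in Python, which also compiles source to byte code before running it. This is not how any particular PHP accelerator is implemented; the cache layout, the function names, and the use of the file's modification time to detect changes are all assumptions made for the sketch.

```python
import os

# Hypothetical in-process cache: path -> (modification time, compiled code object).
_bytecode_cache = {}
stats = {"compiles": 0}  # counts how often the parse/compile step actually runs


def load_compiled(path):
    """Return compiled byte code for path, recompiling only if the file changed."""
    mtime = os.stat(path).st_mtime_ns
    cached = _bytecode_cache.get(path)
    if cached is not None and cached[0] == mtime:
        return cached[1]  # cache hit: the parse and compile steps are skipped
    with open(path, encoding="utf-8") as handle:
        source = handle.read()
    code = compile(source, path, "exec")  # the fixed cost the cache avoids
    stats["compiles"] += 1
    _bytecode_cache[path] = (mtime, code)
    return code


def run_script(path):
    """Execute the script and return the namespace it populated."""
    namespace = {}
    exec(load_compiled(path), namespace)
    return namespace
```

Running the same unchanged script repeatedly through `run_script` compiles it once; only the execution itself, including any database work the script does, is repeated on every call.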

One of the most prevalent PHP accelerators is from Zend, dutifully titled the Zend PHP [4/5] Accelerator. I found it easy to install and was relatively pleased with its results. Having upgraded to PHP5 shortly after its release, I was unable to test other accelerators that have since become available. One of the other accelerators I did manage to try segfaulted the calling Apache process on a variety of my scripts, so be sure to test whichever accelerator you use extensively before putting it on a production system.

    Copyright © 2006 - 2012 e-articles.info.
The texts, articles and tutorials in the directory are property of their respective owners and authors.