Elasticsearch, Isomorphic JavaScript, Presentations, Search

ZendCon 2017

I'm excited to once again be presenting at ZendCon 2017. This year I'll be doing two talks.

On Tuesday, October 24, I'll be presenting "Isomorphic WordPress Applications using NodeifyWP", which will cover isomorphic JavaScript in WordPress, specifically NodeifyWP, Twenty Sixteen React, and the NodeifyWP Environment. Here are my slides.

On Thursday, October 26, I’ll be presenting “Transforming WordPress Search and Query Performance with Elasticsearch”. This talk will cover Elasticsearch, ElasticPress, and WordPress. Here are my slides:

Standard
Isomorphic JavaScript, JavaScript, Node.js

Isomorphic WordPress Applications using NodeifyWP at WordCamp Lancaster 2017

This weekend I'll be presenting on "Isomorphic WordPress Applications using NodeifyWP". I'll also be discussing the NodeifyWP Environment – a Dockerized environment pre-setup for running NodeifyWP applications.

Standard
Isomorphic JavaScript, WordCamps, WordPress Plugins

NodeifyWP and Twenty Sixteen React Debut at WordCamp Denpasar

Today, I am speaking at the inaugural WordCamp Denpasar. My talk covers NodeifyWP and Twenty Sixteen React and includes the debut demo of the framework.

NodeifyWP is a framework, created by 10up, for creating isomorphic JavaScript applications within PHP and WordPress. Twenty Sixteen React is an example theme using the framework along with React.js and Redux.

Here are the slides for my talk:

Standard
Compression Algorithms

Data Compression, PHP, and the Web

Data compression is an extremely important topic in modern computing, networking, and software engineering. Sharing information faster and in smaller sizes across a network is a boundary that will continue to be pushed as long as computers and the internet exist. Large companies like Google, along with many very smart people, have continuously refined existing algorithms and created new ones to make data smaller. Better compression algorithms not only improve company profits but also have implications for low-bandwidth users, critical health data, financial data, etc. The topic is so important, HBO even created a show about it!

Let’s discuss some compression basics as they relate to the web, networking, and PHP.

By far the most used compression technique is deflate, which powers zip, gzip, and zlib. Gzip-compressed data can be decompressed by modern browsers on the fly. Gzip compression is lossless, meaning the original data can be fully recovered during decompression. Due to its power and widespread browser support, it’s almost a standard that we must gzip a website’s contents before returning them to the browser. Here’s how that typically looks:

Browser requests web page -> Nginx receives request -> PHP output is generated/static file is returned -> Nginx gzips the output and responds to the browser -> browser decompresses the data for the end user

Here’s a useful article on enabling compression in nginx.
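To make the "lossless" property concrete, here is a minimal PHP sketch of a gzip round trip. It assumes PHP's zlib extension is available, and the sample markup is made up for illustration; it is not taken from any real page.

```php
<?php
// Minimal sketch: gzip is lossless, so compress + decompress returns the
// original bytes. Assumes the zlib extension; the sample markup is made up.
$html = str_repeat('<p>Hello, compression!</p>', 200);

$compressed = gzencode($html, 9);   // level 9 = maximum gzip compression
$restored   = gzdecode($compressed);

printf(
    "original: %d bytes, gzipped: %d bytes\n",
    strlen($html),
    strlen($compressed)
);
var_dump($restored === $html); // identical bytes after decompression
```

In the flow above, nginx does the compression step and the browser does the decompression step; the PHP calls here only stand in for both ends.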

Recently, Google and Facebook have released their own compression algorithms. Google’s algorithm, Brotli, is another lossless compression solution. In a paper comparing compression algorithms, Brotli’s compressed output (at maximum level) is about 30% smaller than gzip’s.

However, when looking at compression algorithms, we can’t just look at density (also referred to as compression ratio). We also have to consider compression and decompression speed. If our algorithm produces denser data but takes a month to compress, what have we really accomplished? In the paper referenced above, Brotli performs about the same as gzip in compression and decompression time.
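The three measurements discussed here (compression ratio, compression time, decompression time) can be sketched in PHP as follows. This is only an illustration using gzip via the zlib extension; the helper name `measure_gzip` and the sample text are hypothetical, and timings will vary by machine.

```php
<?php
// Hypothetical helper: measures ratio and timings for gzip at a given level.
// Requires the zlib extension and PHP 7.3+ (for hrtime()).
function measure_gzip(string $data, int $level): array
{
    $start = hrtime(true);                       // nanoseconds
    $compressed = gzencode($data, $level);
    $compressMs = (hrtime(true) - $start) / 1e6;

    $start = hrtime(true);
    $restored = gzdecode($compressed);
    $decompressMs = (hrtime(true) - $start) / 1e6;

    return [
        // Compression ratio = original size / compressed size.
        'ratio'         => strlen($data) / strlen($compressed),
        'compress_ms'   => $compressMs,
        'decompress_ms' => $decompressMs,
        'lossless'      => $restored === $data,
    ];
}

// Made-up sample input; real pages will compress differently.
$sample = str_repeat('The quick brown fox jumps over the lazy dog. ', 2000);

foreach ([1, 9] as $level) {
    $result = measure_gzip($sample, $level);
    printf(
        "level %d: ratio %.2f, compress %.3f ms, decompress %.3f ms\n",
        $level,
        $result['ratio'],
        $result['compress_ms'],
        $result['decompress_ms']
    );
}
```

Comparing a low and a high level on the same input shows the trade-off: higher levels usually buy a better ratio at the cost of compression time.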

Zstandard is a lossless compression algorithm announced by Facebook in August 2016. Facebook is touting Zstandard as a solid balance between compression ratio, compression speed, and decompression speed, and as a big step forward in modern computing.

Let’s look at some benchmarks: (table columns in order are: Plugin, Codec, Level, Compression Ratio, Compression Speed, and Decompression Speed)
[Image: benchmarks]

This benchmark was produced by Squash Compression Benchmark on a 122 KB text file.

The results show Brotli has the best file density (compression ratio) while Zstandard has the worst. Zstandard has the fastest compression speed by far while Brotli has the slowest. I ran some of my own tests locally just on compression ratio:

Page       | Original  | Gzip (level 9) | Brotli (level 11) | Zstandard (level 22)
Webpage 1  | 44.05 KB  | 14.45 KB       | 12.67 KB          | 14.05 KB
Webpage 2  | 176.26 KB | 175.98 KB      | 176.27 KB         | 176.28 KB
Webpage 3  | 208.38 KB | 57.09 KB       | 47.76 KB          | 52.14 KB
Webpage 4  | 237.4 KB  | 39.07 KB       | 29.81 KB          | 33.28 KB
Webpage 5  | 191.72 KB | 35.97 KB       | 28.64 KB          | 32.38 KB
Webpage 6  | 113.45 KB | 16.22 KB       | 12.88 KB          | 15.05 KB
Webpage 7  | 533.23 KB | 106.87 KB      | 84.02 KB          | 92.93 KB
Webpage 8  | 146.41 KB | 27.86 KB       | 22.59 KB          | 25.08 KB
Webpage 9  | 30.54 KB  | 6.69 KB        | 5.4 KB            | 6.53 KB
Webpage 10 | 47.92 KB  | 10.23 KB       | 8.22 KB           | 9.86 KB
Webpage 11 | 116.57 KB | 22.35 KB       | 18.55 KB          | 20.87 KB
Webpage 12 | 217.89 KB | 36.57 KB       | 26.93 KB          | 30.44 KB

 

Average gzip compression ratio: 4.73
Average Brotli compression ratio: 5.91
Average Zstandard compression ratio: 5.21
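For anyone who wants to check the averages, here is a small PHP sketch that recomputes them from the table. It assumes each average is the mean of the per-page ratios (original size divided by compressed size), which is my reading of the figures rather than something stated above; the function name `mean_ratio` is hypothetical.

```php
<?php
// Sizes in KB copied from the table: [original, gzip, brotli, zstandard].
$pages = [
    [44.05, 14.45, 12.67, 14.05],
    [176.26, 175.98, 176.27, 176.28],
    [208.38, 57.09, 47.76, 52.14],
    [237.4, 39.07, 29.81, 33.28],
    [191.72, 35.97, 28.64, 32.38],
    [113.45, 16.22, 12.88, 15.05],
    [533.23, 106.87, 84.02, 92.93],
    [146.41, 27.86, 22.59, 25.08],
    [30.54, 6.69, 5.4, 6.53],
    [47.92, 10.23, 8.22, 9.86],
    [116.57, 22.35, 18.55, 20.87],
    [217.89, 36.57, 26.93, 30.44],
];

// Hypothetical helper: mean of per-page ratios for one codec column.
function mean_ratio(array $pages, int $column): float
{
    $ratios = array_map(
        fn(array $row): float => $row[0] / $row[$column],
        $pages
    );

    return array_sum($ratios) / count($ratios);
}

printf("gzip:      %.2f\n", mean_ratio($pages, 1));
printf("Brotli:    %.2f\n", mean_ratio($pages, 2));
printf("Zstandard: %.2f\n", mean_ratio($pages, 3));
```

Note that Webpage 2 barely compresses with any codec (ratio of roughly 1), which pulls every average down.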

So what does this all mean and how does it relate to the web, networking, and PHP?

Well, in the context of serving assets on the web, without a better compression ratio it’s unlikely that anything will unseat gzip. Therefore, while Zstandard’s compression speed is very impressive, it is not useful for serving websites. Moreover, modern browsers can all decompress gzip on the fly. There is no browser support for Zstandard. That being said, one can still use PHP and the zstd extension to compress and decompress files server side.

Brotli, on the other hand, does have a better compression ratio than gzip (and Zstandard). Google claims Brotli’s ratio is about 20-30% higher. Compression ratio improvements are heavily influenced by the type of file being compressed. The tests I ran (table above) show an average compression ratio improvement of about 24%. Brotli’s compression speed is about half that of gzip, but for smaller files (web pages), the compression ratio improvement outweighs the loss in compression speed. Brotli is superior to gzip for serving web assets.

Brotli, unlike gzip, is not universally supported by browsers. In fact, as of now it is supported only by new versions of Chrome and Firefox, not by Safari or IE/Edge. Also, Brotli will only be properly decoded by browsers when served over HTTPS. There is a PHP extension for compressing as well as an nginx module.

As of today, Brotli is ready and worth it for production use based on my tests. We can use PHP to compress page cached files and decompress on the fly (perhaps an addition to Simple Cache) or use nginx to detect browser capabilities and serve Brotli compressed files accordingly. The nginx method is an easy win since all we need to do is compile the Brotli module in nginx and tweak our configuration file.

Shawn Maust wrote a nice article on compiling nginx with the Brotli module. I also wrote an nginx config file that lets you enable Brotli with PHP7 FPM but fall back to gzip for non-supporting browsers.
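The "detect browser capabilities and fall back to gzip" idea comes down to reading the request's Accept-Encoding header. Here is a rough PHP sketch of that decision. It is not the nginx config mentioned above, just an illustration of the same logic; the function name `pick_encoding` is hypothetical, and it deliberately ignores q-value weighting (including `q=0`) for brevity.

```php
<?php
// Hypothetical helper: choose a response encoding from Accept-Encoding.
// Simplification: quality values (e.g. "br;q=0.8") are stripped, not honored.
function pick_encoding(string $acceptEncoding, bool $isHttps): string
{
    $offered = array_map(
        fn(string $token): string => trim(explode(';', $token)[0]),
        explode(',', strtolower($acceptEncoding))
    );

    // Browsers only decode Brotli over HTTPS, so require both conditions.
    if ($isHttps && in_array('br', $offered, true)) {
        return 'br';
    }

    if (in_array('gzip', $offered, true)) {
        return 'gzip';
    }

    return 'identity'; // send the response uncompressed
}

// In a real request these would come from $_SERVER['HTTP_ACCEPT_ENCODING']
// and the connection scheme; the literals here are examples only.
echo pick_encoding('gzip, deflate, br', true), PHP_EOL;
echo pick_encoding('gzip, deflate', true), PHP_EOL;
```

Doing this in nginx is still the simpler route, since the server can make the same choice without any PHP involvement.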

Compression algorithms will continue to be iterated on and improved. For now, we can improve experiences for users and decrease bandwidth usage with Brotli.

Edit: The compression levels used for my tests were 9 for gzip, 11 for Brotli, and 22 for Zstandard.

Standard
Presentations

WordPress Best Practices for Enterprise at Music City Code

Today I'm presenting on Best Practices for WordPress in Enterprise at Music City Code 2016. Here are a few of the topics that will be covered:

  • Caching for high-traffic situations
  • Security techniques
  • Writing maintainable/extensible code
  • Optimizing database reads/writes
  • Search
  • Teamwork in software development
  • Browser performance
  • Workflows
  • Utilizing third party libraries

Slides for anyone who needs them:

This presentation is based on 10up’s amazing Best Practices.

Standard
WooCommerce, WordPress Code Techniques, WordPress Core

post_class() and get_post_class() – Performance Killers for WordPress and WooCommerce

The get_post_class() function is a WordPress function commonly used within post “rivers”. For example, if I had a list of posts, WooCommerce products, or any content type really, I might have some code like this:



<div <?php post_class(); ?>>

Note: post_class() just calls get_post_class() and outputs it to the browser.

post_class() will output something like class="post has-post-thumbnail type-POST-TYPE status-POST_STATUS tag-TAG1 tag-TAG2 category-CATEGORY1 category-CATEGORY2 ...."

The classes added make it easy to style content that has a specific taxonomy term, has a thumbnail, a particular status, etc.

However, the queries needed to determine all this information are not cheap. Moreover, this function is probably called for every post you’re listing. So if you have posts_per_page set to 20, this function will be called 20 times.

Let’s take a look at the function’s code (I’ve trimmed some of the comments):

function get_post_class( $class = '', $post_id = null ) {
	$post = get_post( $post_id );

	$classes = array();

	if ( $class ) {
		if ( ! is_array( $class ) ) {
			$class = preg_split( '#\s+#', $class );
		}
		$classes = array_map( 'esc_attr', $class );
	} else {
		// Ensure that we always coerce class to being an array.
		$class = array();
	}

	if ( ! $post ) {
		return $classes;
	}

	$classes[] = 'post-' . $post->ID;
	if ( ! is_admin() )
		$classes[] = $post->post_type;
	$classes[] = 'type-' . $post->post_type;
	$classes[] = 'status-' . $post->post_status;

	// Post Format
	if ( post_type_supports( $post->post_type, 'post-formats' ) ) {
		$post_format = get_post_format( $post->ID );

		if ( $post_format && !is_wp_error($post_format) )
			$classes[] = 'format-' . sanitize_html_class( $post_format );
		else
			$classes[] = 'format-standard';
	}

	$post_password_required = post_password_required( $post->ID );

	// Post requires password.
	if ( $post_password_required ) {
		$classes[] = 'post-password-required';
	} elseif ( ! empty( $post->post_password ) ) {
		$classes[] = 'post-password-protected';
	}

	// Post thumbnails.
	if ( current_theme_supports( 'post-thumbnails' ) && has_post_thumbnail( $post->ID ) && ! is_attachment( $post ) && ! $post_password_required ) {
		$classes[] = 'has-post-thumbnail';
	}

	// sticky for Sticky Posts
	if ( is_sticky( $post->ID ) ) {
		if ( is_home() && ! is_paged() ) {
			$classes[] = 'sticky';
		} elseif ( is_admin() ) {
			$classes[] = 'status-sticky';
		}
	}

	// hentry for hAtom compliance
	$classes[] = 'hentry';

	// All public taxonomies
	$taxonomies = get_taxonomies( array( 'public' => true ) );
	foreach ( (array) $taxonomies as $taxonomy ) {
		if ( is_object_in_taxonomy( $post->post_type, $taxonomy ) ) {
			foreach ( (array) get_the_terms( $post->ID, $taxonomy ) as $term ) {
				if ( empty( $term->slug ) ) {
					continue;
				}

				$term_class = sanitize_html_class( $term->slug, $term->term_id );
				if ( is_numeric( $term_class ) || ! trim( $term_class, '-' ) ) {
					$term_class = $term->term_id;
				}

				// 'post_tag' uses the 'tag' prefix for backward compatibility.
				if ( 'post_tag' == $taxonomy ) {
					$classes[] = 'tag-' . $term_class;
				} else {
					$classes[] = sanitize_html_class( $taxonomy . '-' . $term_class, $taxonomy . '-' . $term->term_id );
				}
			}
		}
	}

	$classes = array_map( 'esc_attr', $classes );

	$classes = apply_filters( 'post_class', $classes, $class, $post->ID );

	return array_unique( $classes );
}

Within this code, the following functions might result in database queries: get_post_format, has_post_thumbnail, is_sticky, and get_the_terms. The most expensive of these is get_the_terms, which, for each taxonomy associated with the post type, selects all the terms attached to the post for that taxonomy. If there are four taxonomies associated with the post type being queried, get_post_class could result in 7 extra database queries per post (four term lookups plus one for each of the other three functions). With 20 posts per page, that’s an extra 140 queries per page load! On WooCommerce sites, where there are many taxonomies and usually many products shown per page, this is a huge performance killer. Yes, object caching (and page caching of course) will reduce or eliminate some of the database queries, but people will still be hitting the cache cold sometimes.
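The arithmetic can be written down as a tiny PHP sketch. The function name `estimate_post_class_queries` is hypothetical, and the count is the worst case described above (nothing cached, every lookup hits the database), not a measurement.

```php
<?php
// Hypothetical helper: worst-case extra queries caused by get_post_class().
// Assumes one query each for get_post_format, has_post_thumbnail, and
// is_sticky, plus one get_the_terms query per taxonomy, with a cold cache.
function estimate_post_class_queries(int $taxonomies, int $postsPerPage): int
{
    $fixedLookups = 3;

    return ($fixedLookups + $taxonomies) * $postsPerPage;
}

// Four taxonomies and 20 posts per page: (3 + 4) * 20.
echo estimate_post_class_queries(4, 20), PHP_EOL;
```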

Solution:

Don’t use get_post_class or post_class. It’s not that important. 99% of people don’t use the classes it generates. What I do is output the function once, inspect the classes it adds using Chrome, and hardcode the classes actually referenced in CSS into the theme.
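As a rough illustration of the hardcoding approach, here is a self-contained PHP sketch that builds the class attribute only from fields already loaded with the post, so it needs no extra queries. The function name `static_post_classes`, the array shape, and the chosen classes are all assumptions for the example; in a real theme you would keep whichever classes your CSS actually references and use WordPress's own escaping functions.

```php
<?php
// Hypothetical helper: builds a class list from data already on the post,
// so no additional database queries are needed. The classes chosen here
// are examples; keep only the ones your stylesheet actually uses.
function static_post_classes(array $post): string
{
    $classes = [
        'post-' . (int) $post['ID'],
        'type-' . $post['post_type'],
        'hentry',
    ];

    // htmlspecialchars() stands in for esc_attr() outside of WordPress.
    return htmlspecialchars(implode(' ', $classes), ENT_QUOTES);
}

// Example post data; in a theme this would come from the loop's post object.
$post = ['ID' => 42, 'post_type' => 'product'];

echo '<div class="' . static_post_classes($post) . '">', PHP_EOL;
```

If the set of classes never varies, writing the attribute directly into the template markup is simpler still.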

PS: body_class() is much less query intensive and okay to use.

Standard