Custom, SEO-Friendly URLs for Drupal Exposed Filters Part 2

Pierce Lamb
12 min read · May 15, 2020

This is Part 2 of a two-part series detailing how to create custom URLs for Drupal Views Exposed Filters. Part 1 covers how we create, update, and delete these URLs. Part 2 covers how to load and process them.

If you have any questions you can find me @plamb on the Drupal slack chat.

Now that we have the original Exposed Filter paths correlated to custom paths in the path_alias table, let’s look at how to load them in the View and make sure they’re loading the right page when clicked on.

When I first worked on this problem, I was using a contrib module for Views called ‘Better Exposed Filters’ (BEF). I made this choice because I wanted to expose the filters as links (versus a <select> dropdown) and this module makes it very easy to do that (via settings -> Exposed Filter Widget). I will continue as if the reader is using BEF, but it should be possible to extrapolate a non-BEF approach from what follows.

In order to load the correct, custom URLs when the View is loaded, I wanted to use a hook_preprocess_HOOK to hook into the theme that was providing the BEF links. Luckily, BEF has one called hook_preprocess_bef_links. So now, in the same module file we used for CRUD, we create the function custom_urls_preprocess_bef_links(&$variables).

The beginning of the function looks like this:

function custom_urls_preprocess_bef_links(&$variables) {
  $analyst_pages = ['related_firm', 'related_product', 'related_topic'];
  $current_name = $variables['element']['#name'];
  if (in_array($current_name, $analyst_pages)) {
    // We're on an analyst page.
    $links = $variables['links'];
    foreach ($links as $id => $render_arr) {
      // Looping over the right links.
    }
  }
}

Like in Part 1, we’ve hardcoded in the actual query keys. I’ll restate as before that in a general solution we would get these from loading the View directly, but for the sake of speed we hardcode them here. So we compare the #name to the query keys we care about and if there’s a match we know we’re on one of the /analyst-relations pages. We then grab the links array so we can loop over it.

Inside the loop we do this:

if (is_numeric($id)) {
  $url_obj = $render_arr['#url'];
  $uri_string = $url_obj->toString();
  $not_selected = !in_array('bef-link--selected', $render_arr['#attributes']['class']);
  // We don't want the below to run on the selected link because its correct URL is already in place
  // from our custom processOutbound in NiceUrlsPathProcessor.php.
  if (!empty($uri_string) && $uri_string !== '/analyst-relations' && $not_selected) {
    // We're dealing with the non-selected links; retrieve their path(s) from the cache
    // and remove the provided URL object.
    $path = _try_for_fancy_url($uri_string);
    unset($variables['element'][$id]['#url']);
    $markup = "<a href='" . $path . "'>" . $render_arr['#title'] . "</a>";
    $variables['element'][$id]['#markup'] = $markup;
  }
}

The actual links themselves are the only array members keyed by numeric values, so we test for is_numeric and, if so, run our code. First we grab the string representation of the URL object that the View has created for the link (if you read Part 1, something like /analyst-relations?related_firm=5467). The $not_selected boolean is a bit confusing. It exists because this preprocess runs on both the base View (‘/analyst-relations’) AND after any of the exposed filters are clicked (e.g. /analyst-relations/firm/gartner). When we have an actual ‘selected’ (focused) link in the array of links, it has already been given its custom URL in our custom PathProcessor, which we will discuss later, so we want to skip it. We look for the HTML class bef-link--selected to detect it (BEF applies this class automatically).

So for the second if, we make sure the URL object’s string representation is not empty, it’s not equal to the base view path (avoid the ‘All’ link) and it is not currently selected. Then we call the _try_for_fancy_url function. When that function returns a path, we unset the URL object out of the links render array, and then add the appropriate markup for the link with a custom URL.
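One thing worth considering in the markup step is that the path and title are concatenated straight into HTML. As a minimal sketch, assuming the title arrives as a plain string rather than already-sanitized markup, a small helper could escape both values first. The function name custom_urls_build_link_markup is hypothetical and not part of BEF or Drupal:

```php
<?php
// Hypothetical helper (not part of BEF or Drupal): builds the anchor markup
// while escaping both the href value and the link text.
function custom_urls_build_link_markup(string $path, string $title): string {
  $href = htmlspecialchars($path, ENT_QUOTES, 'UTF-8');
  $text = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');
  return '<a href="' . $href . '">' . $text . '</a>';
}

// Prints: <a href="/analyst-relations/firm/gartner">Gartner &amp; Co</a>
echo custom_urls_build_link_markup('/analyst-relations/firm/gartner', 'Gartner & Co'), PHP_EOL;
```

If the title is already escaped by the time it reaches the preprocess function, this would double-escape it, so check what $render_arr['#title'] actually holds before adopting something like this.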

The _try_for_fancy_url function is the meat and potatoes of setting the custom URL. Its skeleton looks like this:

function _try_for_fancy_url($path) {
  $default_cache = \Drupal::cache();
  $maybe_cached_url = $default_cache->get($path);
  if ($maybe_cached_url == FALSE) {
    // Does not exist in cache, create it.
  }
  else {
    // Does exist in cache.
  }
  return $path;
}

So we’re going to go to the cache with the old exposed filter’s path, if it returns false we’re going to create an entry for it in the cache, else we’re going to retrieve its fancy URL and return it. The full function looks like this:

function _try_for_fancy_url($path) {
  $default_cache = \Drupal::cache();
  $maybe_cached_url = $default_cache->get($path);
  if ($maybe_cached_url == FALSE) {
    // Does not exist in cache, create it.
    $maybe_new_path = _db_lookup_fancy_url($path);
    if ($maybe_new_path !== FALSE) {
      // The fancy URL exists in the database, cache it.
      // The leading backslash keeps the class name valid even inside a namespaced file.
      $default_cache->set($path, $maybe_new_path, \Drupal\Core\Cache\CacheBackendInterface::CACHE_PERMANENT);
      $path = $maybe_new_path;
    }
    else {
      // No fancy URL for this path; fall back to the original URL, which was the $path passed in.
    }
  }
  else {
    // Does exist in cache.
    $path = $maybe_cached_url->data;
  }
  return $path;
}

So if the cache misses, we’re going to do a database lookup to find the fancy URL. If that returns a non-false value, we’re going to set the database result into the cache so this database lookup is avoided next time. We will then set the return variable to this custom path. Note, as the comments show, that should anything go wrong in the fancy URL lookup, the system gracefully degrades to the original old exposed filter url.
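To see the cache-then-database flow in isolation, here is a minimal sketch that runs outside Drupal. ArrayCacheStub is a hypothetical in-memory stand-in that only mimics the get()/set() behavior used above (FALSE on a miss, an object with a data property on a hit), and the $lookup closure stands in for _db_lookup_fancy_url with a hardcoded alias:

```php
<?php
// Hypothetical in-memory stand-in for a Drupal cache bin; only the
// get()/set() behavior used in this post is mimicked.
final class ArrayCacheStub {
  private array $items = [];

  public function get(string $cid) {
    // FALSE on a miss, an object with a ->data property on a hit.
    return $this->items[$cid] ?? FALSE;
  }

  public function set(string $cid, $data): void {
    $this->items[$cid] = (object) ['data' => $data];
  }
}

// Cache-aside lookup; $lookup stands in for _db_lookup_fancy_url().
function try_for_fancy_url_sketch(string $path, ArrayCacheStub $cache, callable $lookup): string {
  $cached = $cache->get($path);
  if ($cached !== FALSE) {
    return $cached->data;
  }
  $alias = $lookup($path);
  if ($alias === FALSE) {
    // Graceful fallback to the original exposed filter path.
    return $path;
  }
  $cache->set($path, $alias);
  return $alias;
}

$db_hits = 0;
$lookup = function (string $path) use (&$db_hits) {
  $db_hits++;
  $aliases = ['/analyst-relations?related_firm=5467' => '/analyst-relations/firm/gartner'];
  return $aliases[$path] ?? FALSE;
};

$cache = new ArrayCacheStub();
$first = try_for_fancy_url_sketch('/analyst-relations?related_firm=5467', $cache, $lookup);
$second = try_for_fancy_url_sketch('/analyst-relations?related_firm=5467', $cache, $lookup);

// Prints: /analyst-relations/firm/gartner (database lookups: 1)
echo $first, ' (database lookups: ', $db_hits, ')', PHP_EOL;
```

Note that, as in the function above, only successful lookups are cached. A path with no fancy URL would trigger the lookup again on every call, which may or may not matter depending on how many un-aliased links a page has.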

Here is what the _db_lookup_fancy_url function looks like:

function _db_lookup_fancy_url($path) {
  $uri_arr = \Drupal::service('path_alias.repository')
    ->lookupBySystemPath($path, \Drupal::languageManager()->getCurrentLanguage()->getId());
  if ($uri_arr !== NULL) {
    $new_alias = $uri_arr['alias'];
    if (!empty($new_alias)) {
      return $new_alias;
    }
  }
  return FALSE;
}

We utilize a service called ‘path_alias.repository’ to run a straight lookup on the path_alias table using the generated Exposed Filter’s path. You might remember this from Part 1 as something like /analyst-relations?related_firm=5467. The query should return the custom URL for that path; if it does not, the function simply returns false. The caching we have put in place keeps us from having to do this lookup every single time the relevant View is loaded. With this setup, the lookup for a given path should only occur after caches are flushed, and from then on the cache should provide the custom URL until the next cache flush.

And just like that, our View will now show exposed filters as links with the href attribute containing our custom, fancy URLs.

Okay cool. But what happens when someone clicks one of the fancy URLs!?

In order to render the response Drupal returns from its internal exposed filter paths, e.g. /analyst-relations?related_firm=5467 but on a custom URL like /analyst-relations/firm/gartner we have to write our own PathProcessor. Drupal’s AliasPathProcessor is part of the path_alias module and a central piece of how Drupal processes inbound URLs to the backend and sends back outbound URLs from the backend. It implements two key interfaces, InboundPathProcessorInterface and OutboundPathProcessorInterface which define the two key methods processInbound and processOutbound. In order to provide a custom extension of this class, in our custom_urls module we are going to write our own PathProcessor that is deployed as a Drupal Service.

In the root of /custom_urls/ we are going to add a file called custom_urls.services.yml. In that file we will add this definition:

services:
  custom_urls.path_processor:
    class: Drupal\custom_urls\NiceUrlsPathProcessor
    tags:
      - { name: path_processor_inbound, priority: 100 }
      - { name: path_processor_outbound, priority: 300 }

This file tells Drupal that we are defining a service which will construct our custom class, NiceUrlsPathProcessor. The tags path_processor_inbound/outbound and their associated priorities tell Drupal where in the Request/Response stack to fire our custom processInbound/Outbound functions. If we look at /core/modules/path_alias/path_alias.services.yml we’ll see that our service definition is almost identical to the path_alias module:

  path_alias.path_processor:
    class: Drupal\path_alias\PathProcessor\AliasPathProcessor
    tags:
      - { name: path_processor_inbound, priority: 100 }
      - { name: path_processor_outbound, priority: 300 }
    arguments: ['@path_alias.manager']

The only difference is that AliasPathProcessor needs the path_alias.manager service injected into it (the arguments line).

Okay, so now in /custom_urls/src/ we will add NiceUrlsPathProcessor.php. As a stub, that file will look like:

<?php

namespace Drupal\custom_urls;

use Drupal\Core\PathProcessor\InboundPathProcessorInterface;
use Drupal\Core\PathProcessor\OutboundPathProcessorInterface;
use Drupal\Core\Render\BubbleableMetadata;
use Symfony\Component\HttpFoundation\Request;

class NiceUrlsPathProcessor implements InboundPathProcessorInterface, OutboundPathProcessorInterface {

  public function processInbound($path, Request $request) {
    return $path;
  }

  public function processOutbound($path, &$options = [], Request $request = NULL, BubbleableMetadata $bubbleable_metadata = NULL) {
    return $path;
  }

}

With all of this defined and after flushing caches, you should be able to load urls on your site and notice no changes. Our new processInbound/Outbound are simply returning the paths given to them by AliasPathProcessor. If we insert some debug statements before the return (dd($path), dpm($path) etc) we will be able to see what is being passed to these functions. We notice immediately that they are firing on every single URL we load, so we need to narrow quickly to the exposed filter URLs we care about.

The first thing we’ll want to do is test what is passed to these functions when we try to load a custom URL. With debug statements inserted, we load one of the custom URLs that got stored in our path_alias table, say, /analyst-relations/firm/gartner. If you haven’t inspected Drupal\path_alias\PathProcessor\AliasPathProcessor, you’ll be surprised to find that the $path our custom processInbound receives is actually the internal exposed filter URL, e.g. /analyst-relations?related_firm=5467. Let’s take a look at that class’s processInbound:

public function processInbound($path, Request $request) {
  $path = $this->aliasManager->getPathByAlias($path);
  return $path;
}

So $path = $this->aliasManager->getPathByAlias($path); is passing /analyst-relations/firm/gartner to aliasManager’s getPathByAlias function and returning that custom URL’s exposed filter URL, namely, /analyst-relations?related_firm=5467. This is because of how we stored our custom URLs in the path_alias table in Part 1 of the blog series. This is a great result as it makes our custom processing even easier.

Okay back to our custom processInbound. The first thing we’ll want to do is narrow to URLs that have query parameters, so we will start like this:

public function processInbound($path, Request $request) {
  $split_path_raw = explode("?", $path);
  $path_has_query = count($split_path_raw) > 1;
  if ($path_has_query) {

We split the path on the `?` character and check that the resulting array has more than one element. This instantly weeds out all the URLs on our site that have no query parameters. Next, we’ll want to test whether the base path in this exploded array (the left side of the `?`) matches a View path that we care about. At this point, I’m going to hardcode again to speed things up, but as I mentioned in Part 1, a general solution would load the View config and extract the path and query keys that way. I will issue another warning: the hardcoded solution will stop working if these values change.

Inside NiceUrlsPathProcessor.php we define a private variable with the hardcoded paths associated with their query keys:

private $paths_and_query_keys = [
  '/analyst-relations' => ['related_firm'],
  '/analyst-relations/product' => ['related_product'],
  '/analyst-relations/topic' => ['related_topic'],
];

The query keys are in arrays on the off chance that the View in question uses more than one query key. Okay, so now our processInbound looks like this:

public function processInbound($path, Request $request) {
  $split_path_raw = explode("?", $path);
  $path_has_query = count($split_path_raw) > 1;
  if ($path_has_query) {
    $base_path = $split_path_raw[0];
    $has_fancy_url = $this->testPathForFancyURL($base_path);
    if ($has_fancy_url) {
      $query_keys = $this->getQueryKeys($split_path_raw[1]);

We take the base path out of the exploded $path variable and pass it to a private function for testing if we’re on a View that has custom URLs. That function is simple:

private function testPathForFancyURL($base_path) {
  $paths_with_fancy_urls = array_keys($this->paths_and_query_keys);
  return in_array($base_path, $paths_with_fancy_urls);
}

If the path does have custom URLs, we get the query keys from the second half of the exploded $path variable:

private function getQueryKeys($split_query) {
  $query_arr = [];
  parse_str($split_query, $query_arr);
  return $query_arr;
}
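
As a quick standalone illustration of what the explode and parse_str steps produce for the example path used throughout this post (plain PHP, no Drupal required):

```php
<?php
$path = '/analyst-relations?related_firm=5467';

// Left side of the "?" is the base path, right side is the raw query string.
$split_path_raw = explode('?', $path);

// parse_str() fills $query_arr with the decoded key/value pairs.
$query_arr = [];
parse_str($split_path_raw[1], $query_arr);

// $split_path_raw[0] is '/analyst-relations'
// $query_arr is ['related_firm' => '5467'] (the value is a string)
var_dump($split_path_raw[0], $query_arr);
```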

Now we have the query keys and values in their own array, $query_keys. The remainder of the control structure inside `processInbound` will look like this:

$query_key = array_key_first($query_keys);
$query_value = $query_keys[$query_key];
$request->query->set($query_key, $query_value);
return $base_path;

So we get the query key and value. We then take the Request variable passed into our `processInbound` and set its query key and value to the ones we extracted from the $path. Now we can look at the full function:

public function processInbound($path, Request $request) {
  $split_path_raw = explode("?", $path);
  $path_has_query = count($split_path_raw) > 1;
  if ($path_has_query) {
    $base_path = $split_path_raw[0];
    $has_fancy_url = $this->testPathForFancyURL($base_path);
    if ($has_fancy_url) {
      $query_keys = $this->getQueryKeys($split_path_raw[1]);
      // We've matched an incoming fancy URL.
      // Lines below EXPECT there to only be one query key/value.
      // Will need to modify for multiple k/v.
      $query_key = array_key_first($query_keys);
      $query_value = $query_keys[$query_key];
      $request->query->set($query_key, $query_value);
      return $base_path;
    }
  }
  return $path;
}

Note the comments; array_key_first implies there will only be one query key/value, so this needs to be adjusted for multiple. So we’ve taken the internal exposed filter URL, /analyst-relations?related_firm=5467, extracted the query, used it to set the $request variable’s query, and then returned the string /analyst-relations. So, say someone has clicked on a custom URL like /analyst-relations/firm/gartner. Inside our custom processInbound, we’ve turned around and made it look to the backend like they clicked on /analyst-relations?related_firm=5467. Now the backend is going to process that custom URL exactly the way we want it to, but we also need to ensure that it sends our custom URL, /analyst-relations/firm/gartner, back to the client, which we’ll do in processOutbound.
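For the multiple key/value case mentioned in the comments, one possible direction is sketched below as a pure function so it can run outside Drupal. The function name split_fancy_internal_path and the second query key related_region are hypothetical, introduced only for illustration. It returns the base path plus every query pair the View is known to expose:

```php
<?php
// Hypothetical helper: splits an internal exposed filter path into its base
// path and the query pairs whose keys are listed for that base path.
function split_fancy_internal_path(string $path, array $paths_and_query_keys): array {
  $parts = explode('?', $path, 2);
  if (count($parts) < 2 || !isset($paths_and_query_keys[$parts[0]])) {
    // Not a path we care about; leave it untouched.
    return [$path, []];
  }
  $query = [];
  parse_str($parts[1], $query);
  // Keep only the keys hardcoded for this View path.
  $allowed = array_flip($paths_and_query_keys[$parts[0]]);
  $matched = array_intersect_key($query, $allowed);
  if (empty($matched)) {
    return [$path, []];
  }
  return [$parts[0], $matched];
}

// 'related_region' is a made-up second key for the sake of the example.
$map = ['/analyst-relations' => ['related_firm', 'related_region']];
[$base, $query] = split_fancy_internal_path(
  '/analyst-relations?related_firm=5467&related_region=12',
  $map
);
// $base is '/analyst-relations'
// $query is ['related_firm' => '5467', 'related_region' => '12']
var_dump($base, $query);
```

Inside processInbound, the idea would be to loop over the returned pairs, call $request->query->set($key, $value) for each, and return the base path. I have not tested this against a View with more than one exposed filter.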

Our processOutbound has one tricky piece: when a custom URL is clicked on a View we need to process both the URL that was clicked AND the other exposed filter links on the page that contain custom URLs. The beginning of our processOutbound will look like this:

public function processOutbound($path, &$options = [], Request $request = NULL, BubbleableMetadata $bubbleable_metadata = NULL) {
  $has_fancy_url = $this->testPathForFancyURL($path);
  if ($has_fancy_url && array_key_exists('query', $options)) {

The key difference here is the $options variable. It holds the options for the URL currently being generated (query, fragment, absolute, and so on). By the time one of our exposed filter links reaches our processOutbound, the $options array will have a 'query' key. So we narrow by detecting a fancy URL and the query key. Inside that if, we’re dealing only with exposed filter links for custom URLs, but now we need to differentiate between the custom URL we’re on and the ones for the other links on the page. So we do this:

public function processOutbound($path, &$options = [], Request $request = NULL, BubbleableMetadata $bubbleable_metadata = NULL) {
  $has_fancy_url = $this->testPathForFancyURL($path);
  if ($has_fancy_url && array_key_exists('query', $options)) {
    $has_options_query = !empty($options['query']);
    $has_request_query = $request !== NULL && !empty($request->query->all());
    if (!$has_options_query && $has_request_query) {
      // This narrows to just the exposed filter link that has been clicked.
    }
    else {
      // This would be the other non-clicked exposed filter links.
      // $has_request_query would be false and $has_options_query would be true.
    }
  }
  return $path;
}

We test if the $options['query'] value is empty, and if the request has any query parameters. The custom URL that has been clicked will not have a value in $options['query'], but it will have a set query in the $request variable, because we set it in our processInbound. If it has an $options['query'] we know that we’re dealing with one of the non-clicked exposed filter links on the page. Now we can look at that if/else a bit closer:

if (!$has_options_query && $has_request_query) {
  // This narrows to just the exposed filter link that has been clicked.
  $key = $this->tryToFindKey($path, $request->query->all());
  if (!empty($key)) {
    return $request->getPathInfo();
  }
}
else {
  // This would be the other non-clicked exposed filter links.
  // $has_request_query would be false and $has_options_query would be true.
}

For the clicked link, the key portion is the tryToFindKey function. Here, we match the query key that is sent across in the request (that we set in processInbound) with the query keys we hardcoded in our private variable:

private function tryToFindKey($path, $passed_key) {
  $paths_query_keys = $this->paths_and_query_keys[$path];
  $key = '';
  foreach ($paths_query_keys as $possible_key) {
    // Despite looping here, this code is set to expect one key only.
    // Will need to re-evaluate this loop if multiple keys are passed.
    if (array_key_exists($possible_key, $passed_key)) {
      // We've matched an outgoing fancy URL.
      $key = $possible_key;
      break;
    }
  }
  return $key;
}

Note the comment that this function is coded to expect only a single query key and would need to be modified to match multiple. Now we can look at the finished processOutbound:

public function processOutbound($path, &$options = [], Request $request = NULL, BubbleableMetadata $bubbleable_metadata = NULL) {
  $has_fancy_url = $this->testPathForFancyURL($path);
  if ($has_fancy_url && array_key_exists('query', $options)) {
    $has_options_query = !empty($options['query']);
    $has_request_query = $request !== NULL && !empty($request->query->all());
    if (!$has_options_query && $has_request_query) {
      // This narrows to just the exposed filter link that has been clicked.
      $key = $this->tryToFindKey($path, $request->query->all());
      if (!empty($key)) {
        return $request->getPathInfo();
      }
    }
    else {
      // This would be the other non-clicked exposed filter links.
      // $has_request_query would be false and $has_options_query would be true.
    }
  }
  return $path;
}

In short, we narrow down to the custom URL that has been clicked, in which case we return $request->getPathInfo(), which contains the path that was clicked. Otherwise, if the currently processed path is any other path, we just return it unchanged.
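If the single-key limitation in tryToFindKey ever becomes a problem, the matching step could return every hardcoded key that appears in the request instead of just the first. A minimal sketch in plain PHP follows; the function name find_matching_keys and the key related_region are hypothetical:

```php
<?php
// Hypothetical multi-key variant of the matching step: returns all of the
// known query keys that are present in the request's query parameters.
function find_matching_keys(array $known_keys, array $request_query): array {
  // array_intersect() keeps the values of $known_keys that also appear among
  // the request's keys; array_values() reindexes the result from 0.
  return array_values(array_intersect($known_keys, array_keys($request_query)));
}

$known = ['related_firm', 'related_region'];
$matches = find_matching_keys($known, ['related_region' => '12', 'page' => '2']);
// $matches is ['related_region']
var_dump($matches);
```

The caller would then treat a non-empty result the same way the current code treats a non-empty $key.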

And that’s it! If you’ve been with us through Part 1 and 2, this structure should now work for the custom URLs that have been hardcoded at various places throughout. I’ve mentioned a number of times that the way to generalize our custom URLs module to any View is to load the View specifically and extract its path/query key values instead of hardcoding.
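As a rough idea of what that extraction might look like, the sketch below pulls the exposed query keys out of an array shaped like the filter settings in a View's display configuration (an exposed flag plus an expose.identifier entry per filter). The function name extract_exposed_identifiers and the sample filter names are hypothetical. In Drupal, I would expect the array to come from the loaded View's display (something along the lines of the display handler's 'filters' option), but treat that part as an assumption to verify against your own View:

```php
<?php
// Hypothetical helper: collects the query keys (identifiers) of all exposed
// filters from an array shaped like a View display's filter settings.
function extract_exposed_identifiers(array $filters): array {
  $identifiers = [];
  foreach ($filters as $filter) {
    if (!empty($filter['exposed']) && !empty($filter['expose']['identifier'])) {
      $identifiers[] = $filter['expose']['identifier'];
    }
  }
  return $identifiers;
}

// Sample data only; the filter names are made up for the example.
$filters = [
  'status' => ['exposed' => FALSE],
  'field_related_firm_target_id' => [
    'exposed' => TRUE,
    'expose' => ['identifier' => 'related_firm'],
  ],
];

$identifiers = extract_exposed_identifiers($filters);
// $identifiers is ['related_firm']
var_dump($identifiers);
```

Pairing the result with the display's path would give the same structure as the hardcoded $paths_and_query_keys array.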

With this structure, for any page load, only a few extra lines of code are executing. For the custom URL pages, we will only have to go to the database after every cache flush. While I don’t have any performance numbers, I expect this code to perform well.

If you have any questions you can find me @plamb on the Drupal slack chat.


Pierce Lamb

Data & Machine Learning Engineer at a Security startup