Custom, SEO-friendly URLs for Drupal Exposed Filters Part 1

Pierce Lamb
8 min read · May 15, 2020

Part 1 covers when and how to generate custom URLs, Part 2 covers how to load and process these URLs.

If you have any questions you can find me @plamb on the Drupal slack chat.

A core feature of Drupal is the View. Views are pages that list content on a Drupal website. Despite that simple-sounding description, Views can be complex, especially for beginners. One feature we commonly want on a content listing page is the ability for an end user to filter the displayed content dynamically. For example, a car website might display all cars in a database on a listing page, and an end user might want to click an exposed filter called ‘Tesla’ to show only the Tesla models. Drupal provides this functionality out of the box.

Exposed filters in Drupal work by attaching query parameters to the base URL of the View, which the backend uses to filter the content. For example, if I have a View with the path /analyst-relations that displays content from large technology analysts, one exposed filter might be a link with the title Gartner. The path attached to the Gartner link will look like /analyst-relations?related_firm=5467. This query parameter, ?related_firm=5467, provides all the information Drupal needs to filter the content. However, it is not a very nice-looking, descriptive URL. Ideally the link associated with the Gartner filter is something like /analyst-relations/firm/gartner.
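To make the two URL shapes concrete, here is a small plain-PHP sketch (no Drupal required) that pulls the default exposed filter link apart with PHP’s built-in parse_url and parse_str. The helper name describe_filter_url is hypothetical and exists only for illustration.

```php
<?php
// Hypothetical helper: split an exposed filter link into the View path
// and its query parameters, using only PHP built-ins.
function describe_filter_url(string $url): array {
  $path = parse_url($url, PHP_URL_PATH);
  $query = parse_url($url, PHP_URL_QUERY);
  $params = [];
  // parse_url() returns NULL when the URL has no query string.
  if (is_string($query)) {
    parse_str($query, $params);
  }
  return ['path' => $path, 'params' => $params];
}

$parts = describe_filter_url('/analyst-relations?related_firm=5467');
// $parts['path'] is '/analyst-relations'
// $parts['params'] is ['related_firm' => '5467']
```

The default link carries its meaning in the params array (a vocabulary key and a term ID), while the custom URL we are aiming for carries it in the path itself and has no query string at all.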

I should note now that I am not an SEO expert and I don’t know for certain if custom exposed filter links will affect ranking in search engines. However, when I click a link like /analyst-relations/firm/gartner I have a much better idea of what information will be contained on that page than if I click /analyst-relations?related_firm=5467. Since serving these URLs does not have a high performance cost and they provide a more user-friendly experience, I believe that is reason enough to serve them.

Our goal is to replace all default exposed filter links with custom, descriptive URLs. The first question is, how do we create the custom URLs programmatically? Each URL will need to be unique and based on the content(s) it is related to. One option would be to do this dynamically as a page with exposed filter links is being loaded. Another option is to generate and store the custom URL whenever the relevant content is created/updated/deleted. I preferred the second option as it feels safer, more performant, and Drupal 8/9 comes with the path_alias module which I believe fits this task. I’ll note that this decision is definitely up for debate.

Okay, so we’re going to generate these custom URLs at CRUD time for relevant content(s). The quickest way to do that is to implement hook_entity_insert, hook_entity_update, and hook_entity_delete in a custom module. From a technical debt perspective there may be a better way to do this, e.g. by extending Entity classes, but these hooks will get you to a proof of concept the quickest. Every time any Entity is created, updated or deleted, these hooks are going to fire. If our custom module is called custom_urls, in our custom_urls.module file we would have:

/**
 * Implements hook_entity_insert().
 */
function custom_urls_entity_insert(Drupal\Core\Entity\EntityInterface $entity) {
  _create_or_update_path_alias($entity);
}

/**
 * Implements hook_entity_update().
 */
function custom_urls_entity_update(Drupal\Core\Entity\EntityInterface $entity) {
  _create_or_update_path_alias($entity);
}

/**
 * Implements hook_entity_delete().
 */
function custom_urls_entity_delete(Drupal\Core\Entity\EntityInterface $entity) {
  _delete_path_alias($entity);
}

Inside of _create_or_update_path_alias and _delete_path_alias the first thing we’ll do is narrow down to only the entities we care about. That function will be called: _is_relevant_entity. Exposed Filters are often based on TaxonomyTerms or specific Entity bundles. For our example, inside _is_relevant_entity we will narrow to only the Terms and Entity Bundle we care about:

function _is_relevant_entity(Drupal\Core\Entity\EntityInterface $entity) {
  $entity_arr = [
    'boolean' => FALSE,
    'old_path' => '',
    'new_path' => '',
  ];
  $maybe_term = $entity instanceof Drupal\taxonomy\Entity\Term;
  if ($maybe_term) {
    ...
  } elseif ($entity->bundle() == 'product') {
    ....
  }
  return $entity_arr;
}

$entity_arr will be used to carry information about whether the Entity is relevant, what the generated exposed filter path is, and what the custom URL will be. If you follow the control structure you can see we’re going to use it to determine what the boolean value should be, and for our example we care about Terms and Entities of type product. In our proof of concept, it would look something like this:

function _is_relevant_entity(Drupal\Core\Entity\EntityInterface $entity) {
  $entity_arr = [
    'boolean' => FALSE,
    'old_path' => '/analyst-relations',
    'new_path' => '',
  ];
  $maybe_term = $entity instanceof Drupal\taxonomy\Entity\Term;
  if ($maybe_term) {
    // Map each relevant vocabulary to the suffix the View generates for it.
    $relevant_taxonomies = [
      'related_topics' => '/topic?related_topic=',
      'related_companies' => '?related_firm=',
    ];
    $taxonomy_name = $entity->bundle();
    // Only build the path for vocabularies we care about; looking up any
    // other vocabulary would hit an undefined array key.
    if (array_key_exists($taxonomy_name, $relevant_taxonomies)) {
      $entity_arr['boolean'] = TRUE;
      $entity_arr['old_path'] .= $relevant_taxonomies[$taxonomy_name] . $entity->id();
    }
  } elseif ($entity->bundle() == 'product') {
    $entity_arr['boolean'] = TRUE;
    $entity_arr['old_path'] .= '/product?related_product=' . $entity->id();
  }
  return $entity_arr;
}

As you can see, to get to a POC, I’ve done a lot of hardcoding here. In a fully general and safer solution, we’d load the View and get the old_path and the values in $relevant_taxonomies that way. However, via hardcoding I’ve generated the exact same paths that the View will create, e.g. /analyst-relations?related_firm=5467. Note that if you don’t generalize this and the query keys or path in the View change (they are customizable), this will stop working.
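One low-effort step toward that generalization is to keep every hardcoded value in a single array, so a change to the View means editing one place. The sketch below is plain PHP with hypothetical names (FILTER_CONFIG, build_old_path); it still copies the values from the example above instead of reading them from the View, and for brevity it treats the product bundle and the two vocabularies the same way.

```php
<?php
// Assumed values copied from the hardcoded example; a general solution
// would read these from the View instead.
const FILTER_CONFIG = [
  'view_path' => '/analyst-relations',
  'suffixes' => [
    'related_topics' => '/topic?related_topic=',
    'related_companies' => '?related_firm=',
    'product' => '/product?related_product=',
  ],
];

// Hypothetical helper: returns the default exposed filter path for a
// bundle and entity ID, or NULL when the bundle is not one we care about.
function build_old_path(array $config, string $bundle, int $id): ?string {
  if (!array_key_exists($bundle, $config['suffixes'])) {
    return null;
  }
  return $config['view_path'] . $config['suffixes'][$bundle] . $id;
}

$firm_path = build_old_path(FILTER_CONFIG, 'related_companies', 5467);
// $firm_path is '/analyst-relations?related_firm=5467'
```

Returning NULL for an unknown bundle doubles as the relevance check, which is the role the boolean key plays in $entity_arr.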

Okay, so back to our _create_or_update_path_alias function. The beginning will look something like this:

function _create_or_update_path_alias($entity) {
  $raw_entity_arr = _is_relevant_entity($entity);
  if ($raw_entity_arr['boolean']) {
    // Update the path alias with the new URL.
    $clean_entity_arr = _build_custom_url($entity, $raw_entity_arr);
    // ... (the rest of the function is filled in further down)
  }
}

We use the boolean key to make sure we have an Entity we care about. Next we generate the custom url in _build_custom_url. That function will look like this:

function _get_url_from_regex($title) {
  // Collapse each run of whitespace into a single dash.
  $replace_whitespace = preg_replace('/\s+/', '-', $title);
  // Drop everything except letters, periods, and dashes.
  $new_path_caboose = preg_replace('/[^a-zA-Z.-]/', '', $replace_whitespace);
  return $new_path_caboose;
}

function _build_custom_url($entity, $entity_arr) {
  $maybe_product = $entity->bundle() == 'product';
  // EntityInterface::url() was removed in Drupal 9; toUrl()->toString()
  // is its replacement.
  $raw_entity_url = $entity->toUrl()->toString();
  $entity_url_arr = explode('/', $raw_entity_url);
  if ($maybe_product) {
    // It's a product Node.
    $new_path_train = '/analyst-relations/product/';
    if (array_key_exists(2, $entity_url_arr)) {
      $new_path_caboose = $entity_url_arr[2];
    } else {
      $new_path_caboose = _get_url_from_regex($entity->label());
    }
  } else {
    // It's a taxonomy term.
    $old_path = $entity_arr['old_path'];
    $maybe_firm = strpos($old_path, 'firm') !== FALSE;
    if ($maybe_firm) {
      // Firm filter.
      $new_path_train = '/analyst-relations/firm/';
    } else {
      // Topic filter.
      $new_path_train = '/analyst-relations/topic/';
    }
    if (count($entity_url_arr) > 1 && $entity_url_arr[1] !== 'taxonomy') {
      $new_path_caboose = $entity_url_arr[1];
    } else {
      $new_path_caboose = _get_url_from_regex($entity->label());
    }
  }
  $new_path = $new_path_train . strtolower($new_path_caboose);
  $entity_arr['new_path'] = $new_path;
  return $entity_arr;
}

In this function, we attempt to create the custom URL from the URL attached to Products and Taxonomy Terms. If we’re unable to, we pass the $entity->label() through some regexes. I’ve split the regexes into two inside _get_url_from_regex to make it easier to understand what is going on. We take the Entity’s label and replace any whitespace in it with a dash. We then remove every character that is not a letter, a period, or a dash (so digits are dropped too). This produces strings that should work as the end (the caboose) of the new path, and they replace the ID number from the old path. Then, whether we have a product or an appropriate taxonomy term, we create the first part (the train) of the custom URL. Again, this has been hardcoded for speed, but as above, in a general solution we’d load the View and create these. And as above, if the View’s path changes this will stop working.
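To see what the two regex passes actually produce, here is a standalone sketch that applies them, plus the strtolower() from _build_custom_url, to a few sample labels. The function name slug_from_label and the second and third labels are made up for illustration.

```php
<?php
// Same two regex passes as _get_url_from_regex, followed by the
// strtolower() that _build_custom_url applies. The name is hypothetical.
function slug_from_label(string $label): string {
  $dashed = preg_replace('/\s+/', '-', $label);
  $cleaned = preg_replace('/[^a-zA-Z.-]/', '', $dashed);
  return strtolower($cleaned);
}

echo slug_from_label('Gartner'), "\n";                  // gartner
echo slug_from_label('Forrester Research, Inc.'), "\n"; // forrester-research-inc.
echo slug_from_label('IDC 2020'), "\n";                 // idc-
```

The last case shows the cost of stripping digits: the slug keeps a trailing dash, and two labels that differ only in their numbers would produce the same custom URL.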

Okay so now we have an array that verifies we have a correct Entity, its old exposed filter path and the new custom path we want it to have. Now we are going to use the entityTypeManager() to query the path_alias table. Let’s view some more of the _create_or_update_path_alias function:

function _create_or_update_path_alias($entity) {
  $raw_entity_arr = _is_relevant_entity($entity);
  if ($raw_entity_arr['boolean']) {
    // Update the path alias with the new URL.
    $clean_entity_arr = _build_custom_url($entity, $raw_entity_arr);
    $old_path = $clean_entity_arr['old_path'];
    $new_path = $clean_entity_arr['new_path'];
    $path_alias_conn = \Drupal::entityTypeManager()->getStorage('path_alias');
    $new_path_already_exists = $path_alias_conn->loadByProperties(['alias' => $new_path]);
    if (empty($new_path_already_exists)) {
      $maybe_path_alias = $path_alias_conn->loadByProperties(['path' => $old_path]);
      if (empty($maybe_path_alias)) {
        // Create path alias.
      } else if (count($maybe_path_alias) == 1) {
        // Update path alias.
      } else {
        // We've somehow returned more than one result for the old path. Something is wrong.
        \Drupal::logger('custom_urls')->notice("The path: " . $old_path . ", is returning more than one result in path_alias");
      }
    } else {
      \Drupal::logger('custom_urls')->notice("The generated path: " . $new_path . ", already exists in path_alias. An entity with an identical title was likely created");
    }
  }
}

So we get the storage handler for the path_alias table. First we test whether the $new_path (the custom URL) already exists there. If it does, we don’t do anything except send a message to the logger so we’re aware that the current Entity is trying to create a custom URL that already exists. Otherwise, we check whether the $old_path (the generated exposed filter path) is already in the path_alias table (note that because these paths contain the entity’s ID, they should only ever conflict on the rare chance that, say, a Node and a Term on the same View have the same ID). If it is not, we create the new path_alias entry using the $old_path and $new_path; else if it comes back with one result, then we have an update and we set the alias to $new_path; else we’ve somehow returned more than one result for the $old_path and we notify the logger. Here is the function completely filled out:
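The branching is easier to reason about when it is separated from the storage calls. The sketch below is a hypothetical pure function (decide_alias_action is not part of the module) that takes the two result counts and returns which branch would run.

```php
<?php
// Hypothetical pure function mirroring the branches described above.
// $alias_matches: path_alias entries whose alias equals the new path.
// $path_matches: path_alias entries whose path equals the old path.
function decide_alias_action(int $alias_matches, int $path_matches): string {
  if ($alias_matches > 0) {
    // The custom URL is already taken; only log.
    return 'log_duplicate_alias';
  }
  if ($path_matches === 0) {
    return 'create';
  }
  if ($path_matches === 1) {
    return 'update';
  }
  // More than one entry for the same old path should not happen.
  return 'log_ambiguous_path';
}

echo decide_alias_action(0, 0), "\n"; // create
echo decide_alias_action(0, 1), "\n"; // update
echo decide_alias_action(1, 1), "\n"; // log_duplicate_alias
```

One consequence worth knowing: re-saving an entity without changing its label appears to land in the first branch, because the entity’s own alias already matches the generated path, so the “already exists” log message will also show up for harmless updates.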

function _create_or_update_path_alias($entity) {
  $raw_entity_arr = _is_relevant_entity($entity);
  if ($raw_entity_arr['boolean']) {
    // Update the path alias with the new URL.
    $clean_entity_arr = _build_custom_url($entity, $raw_entity_arr);
    $old_path = $clean_entity_arr['old_path'];
    $new_path = $clean_entity_arr['new_path'];
    $path_alias_conn = \Drupal::entityTypeManager()->getStorage('path_alias');
    $new_path_already_exists = $path_alias_conn->loadByProperties(['alias' => $new_path]);
    if (empty($new_path_already_exists)) {
      $maybe_path_alias = $path_alias_conn->loadByProperties(['path' => $old_path]);
      if (empty($maybe_path_alias)) {
        // Create path alias.
        $new_path_ent = $path_alias_conn->create([
          'path' => $old_path,
          'alias' => $new_path,
          'langcode' => \Drupal::languageManager()->getCurrentLanguage()->getId(),
        ]);
        $new_path_ent->save();
        // Add new URL to cache.
        _cache_fancy_url($old_path, $new_path);
      } else if (count($maybe_path_alias) == 1) {
        // Update path alias.
        $path_alias_obj = reset($maybe_path_alias);
        $path_alias_obj->set('alias', $new_path);
        $path_alias_obj->save();
        // Drop old URL from cache and add new one.
        _cache_fancy_url($old_path, $new_path);
      } else {
        // We've somehow returned more than one result for the old path. Something is wrong.
        \Drupal::logger('custom_urls')->notice("The path: " . $old_path . ", is returning more than one result in path_alias");
      }
    } else {
      \Drupal::logger('custom_urls')->notice("The generated path: " . $new_path . ", already exists in path_alias. An entity with an identical title was likely created");
    }
  }
}

But wait, another function snuck in there: _cache_fancy_url($old_path,$new_path). In Part 2 of this series, we will look at how to load and process the custom urls; doing this from the cache is definitely the fastest way to do that, so we create/modify cache entries here. For clarity, I will show that function here:

function _cache_fancy_url($old_path, $new_path) {
  $default_cache = \Drupal::cache();
  $old_path_result = $default_cache->get($old_path);
  if ($old_path_result !== FALSE) {
    // Old path in cache, likely a Term or Product has been modified.
    // Delete the old entry.
    $default_cache->delete($old_path);
  }
  // Add the new entry.
  $default_cache->set($old_path, $new_path, Drupal\Core\Cache\CacheBackendInterface::CACHE_PERMANENT);
}

Caching here isn’t too important, because when we load the custom URLs, if they aren’t in the cache (perhaps after a cache flush) we will set them there. Still, for the extra performance a five-line function imparts, it is worth it.
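The “set them there if they are missing” behavior is the usual cache-aside pattern. Here is a minimal plain-PHP sketch of it, assuming a tiny in-memory ArrayCache class as a stand-in; it is not Drupal’s cache API (Drupal’s real backend returns a cache item object from get(), not the raw value), and lookup_custom_url is a hypothetical name.

```php
<?php
// Tiny in-memory stand-in for a cache backend; NOT Drupal's cache API.
final class ArrayCache {
  private array $items = [];

  public function get(string $key) {
    return array_key_exists($key, $this->items) ? $this->items[$key] : false;
  }

  public function set(string $key, string $value): void {
    $this->items[$key] = $value;
  }
}

// Hypothetical lookup: try the cache first, fall back to a slower source
// (passed in as a callable), and store the result for next time.
function lookup_custom_url(ArrayCache $cache, string $old_path, callable $load): ?string {
  $hit = $cache->get($old_path);
  if ($hit !== false) {
    return $hit;
  }
  $new_path = $load($old_path);
  if ($new_path !== null) {
    $cache->set($old_path, $new_path);
  }
  return $new_path;
}

$cache = new ArrayCache();
$calls = 0;
// Stand-in for the slow lookup (e.g. a query against path_alias).
$load = function (string $old_path) use (&$calls): ?string {
  $calls++;
  return $old_path === '/analyst-relations?related_firm=5467'
    ? '/analyst-relations/firm/gartner'
    : null;
};

$first = lookup_custom_url($cache, '/analyst-relations?related_firm=5467', $load);
$second = lookup_custom_url($cache, '/analyst-relations?related_firm=5467', $load);
// Both calls return '/analyst-relations/firm/gartner'; $load runs only once.
```

Warming the cache at CRUD time, as _cache_fancy_url does, simply means the first lookup after a save is already a hit.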

The delete helper called from hook_entity_delete, _delete_path_alias, is highly similar to the create/update helper. I’ll paste it here, and I imagine if you’ve read the above not much explanation is needed:

function _delete_path_alias(Drupal\Core\Entity\EntityInterface $entity) {
  $raw_entity_arr = _is_relevant_entity($entity);
  if ($raw_entity_arr['boolean']) {
    // Delete the associated path alias.
    $old_path = $raw_entity_arr['old_path'];
    $path_alias_conn = \Drupal::entityTypeManager()->getStorage('path_alias');
    $maybe_path_alias = $path_alias_conn->loadByProperties(['path' => $old_path]);
    if (count($maybe_path_alias) == 1) {
      // The storage handler's delete() takes an array of entities.
      $path_alias_conn->delete($maybe_path_alias);
      // _delete_from_cache() is not shown here.
      _delete_from_cache($maybe_path_alias);
    } else {
      \Drupal::logger('custom_urls')
        ->notice("The path: " . $old_path . ", was set to delete from path_alias, but it returned " . count($maybe_path_alias) . " results");
    }
  }
}

So now every time an Entity we care about in our View with exposed filters is created/updated/deleted we are also creating/updating/deleting and caching its associated custom URL. I prefer this way of creating the custom URLs versus creating them dynamically when the page loads as I feel that executing this extra code at entity CRUD time is more performant than at page load. While I know path_alias was intended for URLs like /node/1, I feel that this usage of the path_alias table matches its general intention: to provide nice aliases for non-nice paths.

We are one big step closer to custom URLs on a View with exposed filters. Check out Part 2 to see how to load and process these custom URLs.

Pierce Lamb

Data & Machine Learning Engineer at a Security startup