I'm building a clean canonical URL that always returns one base URL, and I'm stuck on the following case:
<?php
# every page
$extensions = $_SERVER['REQUEST_URI']; # path plus query string, e.g. /de/home.ext?ln=de
$qsIndex = strpos($extensions, '?'); # position of the query string, or FALSE if there is none
$pageclean = $qsIndex !== FALSE ? substr($extensions, 0, $qsIndex) : $extensions; # drops the ?ln=de part
$canonical = "http://website.com" . $pageclean; # basic canonical URL
?>
<html><head><link rel="canonical" href="<?=$canonical?>"></head>
When the URL is http://website.com/de/home.ext?ln=de, the canonical URL becomes:
http://website.com/de/home.ext
But I want to remove the file extension as well, whether it's .php, .ext, .inc or any other two- or three-character extension (.[xx] or .[xxx]),
so the base URL becomes: http://website.com/de/home
Much nicer! But how do I achieve that in the current code? Any hints are much appreciated!
I think this should do it: strip off the end if there is an extension, just as you did for the query string:
$pageclean = $qsIndex !== FALSE ? substr($extensions, 0, $qsIndex) : $extensions;
$dotIndex = strrpos($pageclean, '.');
$pagecleanNoExt = $dotIndex !== FALSE ? substr($pageclean, 0, $dotIndex) : $pageclean;
$canonical = "http://website.com" . $pagecleanNoExt; # basic canonical url
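One caveat with cutting at the last dot: a path such as /v1.2/home has a dot in a directory name, and a name like home.html has more than three characters after the dot. A minimal sketch of a stricter variant follows, assuming a hypothetical helper name stripShortExtension (not part of the original code) and assuming that "extension" means exactly two or three characters after the last dot of the final path segment:

```php
<?php
// Hypothetical helper: drop a 2-3 character extension from the last path segment only.
function stripShortExtension(string $path): string
{
    $slashIndex = strrpos($path, '/');
    $dotIndex = strrpos($path, '.');

    // No dot at all, or the last dot sits in a directory name: nothing to strip.
    if ($dotIndex === false || ($slashIndex !== false && $dotIndex < $slashIndex)) {
        return $path;
    }

    // Number of characters after the last dot.
    $extLength = strlen($path) - $dotIndex - 1;

    // Only treat 2 or 3 trailing characters as an extension.
    return ($extLength === 2 || $extLength === 3) ? substr($path, 0, $dotIndex) : $path;
}
```

It would slot into the code above as $canonical = "http://website.com" . stripShortExtension($pageclean); the sketch does not check which characters make up the extension, only how many there are.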
Try this:
preg_match("/(.*)\.([^\?]{2,3})(\?(.*)){0,1}$/msiU", $_SERVER['REQUEST_URI'], $res);
$canonical = "http://website.com" . $res[1];
Here $res[1] is the clean URL, $res[2] is the extension, and $res[4] is everything after the "?" (if present and if you need it).
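Note that the pattern above requires a dot, so for a URI with no dot at all, such as /de/home, preg_match finds nothing and $res[1] is not set. A sketch that also handles that case, using preg_replace instead of preg_match, could look like this. The function name canonicalFromUri and its $base parameter are hypothetical, and the extension is assumed to consist of letters or digits only:

```php
<?php
// Hypothetical helper: build the canonical URL from a request URI.
function canonicalFromUri(string $requestUri, string $base = "http://website.com"): string
{
    // Drop the query string: keep everything before the first "?".
    $path = explode('?', $requestUri, 2)[0];

    // Remove a trailing ".xx" or ".xxx" (letters or digits);
    // paths without such an extension pass through unchanged.
    $path = preg_replace('/\.[a-z0-9]{2,3}$/i', '', $path);

    return $base . $path;
}
```

Usage would be $canonical = canonicalFromUri($_SERVER['REQUEST_URI']); because preg_replace returns the subject unchanged when nothing matches, no separate check for a missing extension is needed.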