Htaccess [SOLVED]: Force site to HTTPS except for some pages and Facebook crawler


  • #36977

    Anonymous

    Question

    There are a few similar questions to this, but none really covered everything I need to do and I’m in a bit over my head!
    I have an existing WordPress site. I want to force the home page and any new subpages to HTTPS but force existing subpages (about 20 of them) to HTTP. The reason is that these subpages have long Facebook comment threads that I don’t want to lose, and the canonical workarounds only retain likes/shares, not comments. To retain likes/shares, the Facebook crawler needs to be able to access the HTTP version of the home page.

    So I need to work out the code for htaccess to enable:
    1. Force site generally to be HTTPS
    2. Force certain pages to be HTTP
    3. Allow the Facebook crawler to access the HTTP version of the home page (only).

    Any help greatly appreciated.
    EDIT: added the code I thought I’d try, but haven’t tested yet:

    RewriteEngine On 
    # Go to https for all but existing subpages
    RewriteCond %{SERVER_PORT} 80
    RewriteCond %{REQUEST_URI} !^ page1 | page2 | page3 $ [NC]
    RewriteRule ^(.*)$ https://www.example.com/$1 [R,L] 
    
    # Go to http for existing subpages 
    RewriteCond %{SERVER_PORT} !80 
    RewriteCond %{REQUEST_URI} ^ page1 | page2 | page3 $ [NC] 
    RewriteRule ^(.*)$ http://www.example.com/$1 [R,L]
    

    I’m not sure where to put the Facebook crawler exception, nor whether I have the correct syntax to exclude pages, bearing in mind it’s a WordPress site.

    #36978

    Anonymous

    Accepted Answer

    You can check for the Facebook crawler’s user agent, which identifies itself as facebookexternalhit or Facebot:

    # Go to http for home page if Facebook Crawler
    RewriteCond %{SERVER_PORT} !80
    RewriteCond %{HTTP_USER_AGENT} ^(facebookexternalhit|Facebot)
    RewriteRule ^$ http://www.example.com/ [R,L]
    
    # Facebook Crawler on the http home page: leave the request unchanged
    RewriteCond %{HTTP_USER_AGENT} ^(facebookexternalhit|Facebot)
    RewriteRule ^$ - [L]
    
    # Go to https for all but existing subpages
    RewriteCond %{SERVER_PORT} 80
    RewriteCond %{REQUEST_URI} !^/(page1|page2|page3)$ [NC]
    RewriteRule ^(.*)$ https://www.example.com/$1 [R,L]
    
    # Go to http for existing subpages 
    RewriteCond %{SERVER_PORT} !80
    RewriteCond %{REQUEST_URI} ^/(page1|page2|page3)$ [NC]
    RewriteRule ^(.*)$ http://www.example.com/$1 [R,L]
    

    Source: https://stackoverflow.com/questions/48042280/force-site-to-https-except-for-some-pages-and-facebook-crawler
    Author: Ben
    This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
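
One caveat with REQUEST_URI-based exclusions on a WordPress site: WordPress's own rules internally rewrite pretty permalinks to /index.php, and in .htaccess the rules are then evaluated again with REQUEST_URI set to /index.php, so an exclusion such as !^/(page1|page2|page3)$ may stop matching on that second pass. A possible variant, sketched here under several assumptions, matches on %{THE_REQUEST} instead, which holds the original request line and does not change after internal rewrites. It swaps %{SERVER_PORT} for %{HTTPS} (assuming that variable reflects the real scheme, which may not hold behind some proxies), assumes permalinks may carry a trailing slash, and assumes the block sits above the "# BEGIN WordPress" section. page1, page2, page3 and www.example.com are the placeholders from the thread. Treat it as an untested sketch rather than a drop-in fix.

```apache
RewriteEngine On

# 1. Facebook crawler asking for the home page over HTTPS: send it to HTTP.
RewriteCond %{HTTPS} on
RewriteCond %{HTTP_USER_AGENT} ^(facebookexternalhit|Facebot) [NC]
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/[?\s]
RewriteRule ^ http://www.example.com/ [R=302,L]

# 2. Force HTTPS for HTTP requests, unless the original request was for
#    a legacy page, or was the Facebook crawler asking for the home page.
RewriteCond %{HTTPS} off
RewriteCond %{THE_REQUEST} !^[A-Z]+\s/(page1|page2|page3)/?[?\s] [NC]
# The next two conditions are OR-ed: not the crawler, or not the home page.
RewriteCond %{HTTP_USER_AGENT} !^(facebookexternalhit|Facebot) [NC,OR]
RewriteCond %{THE_REQUEST} !^[A-Z]+\s/[?\s]
RewriteRule ^ https://www.example.com%{REQUEST_URI} [R=302,L]

# 3. Force HTTP for the legacy pages when they are requested over HTTPS.
RewriteCond %{HTTPS} on
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/(page1|page2|page3)/?[?\s] [NC]
RewriteRule ^ http://www.example.com%{REQUEST_URI} [R=302,L]
```

The redirects are temporary (302) so that browsers do not cache a wrong rule while testing. Once the behavior is confirmed, for example by requesting each URL with and without a Facebook user-agent header, R=301 could be used instead.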
