Is there an HTTP header to indicate which base URL to use for relative links?

I retrieve the page from another host and then initialize the form with data from the database before sending it to the user.

I need to make the URLs in href and src attributes absolute so that browsers download them from the right place.

Is it possible to set an HTTP header so that this happens without changing the HTML?

+4
3 answers

No such header exists in HTTP. But you can set the base URL with the HTML BASE element, for example:

 <base href="http://example.com/"> 
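Since the page in the question is fetched from another host before being sent on, the BASE element could be added at that point rather than by editing the source HTML. A minimal sketch, assuming the fetched page is available as a string and contains an opening head tag; the function name `inject_base` is hypothetical:

```python
import re
from html import escape


def inject_base(html: str, base_url: str) -> str:
    """Insert a <base> element right after the opening <head> tag."""
    base_tag = '<base href="%s">' % escape(base_url, quote=True)
    # Match the opening <head ...> tag case-insensitively (but not <header>).
    head_open = re.compile(r"<head(\s[^>]*)?>", re.IGNORECASE)
    match = head_open.search(html)
    if match is None:
        # Assumption of this sketch: pages without <head> are not handled.
        raise ValueError("no <head> element found")
    return html[:match.end()] + base_tag + html[match.end():]


page = "<html><head><title>t</title></head><body><img src='a.png'></body></html>"
print(inject_base(page, "http://example.com/"))
```

Keep in mind that BASE applies to every relative URL in the document, including form actions, not only to href and src.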
+9

No. The only way to do this is with the <base> element in the HTML output.

See docs here: HTML <base> Tag

Alternative idea

If you can't touch the HTML, you can put something together using mod_rewrite. You would create a 301 redirect rule for your image resources that points to the remote server. The only condition is that your image requests follow a fixed pattern (e.g. /images/xyz.jpg), which you can translate into a RewriteRule.

This tutorial should help you get started.
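The mapping such a redirect rule performs can be sketched in Python. This is not mod_rewrite configuration, only an illustration of the pattern-to-redirect logic; the remote host, the /images/ pattern, and the name `redirect_target` are assumptions for the example:

```python
import re
from typing import Optional

# Hypothetical remote host and path pattern; adjust to the real layout.
REMOTE_BASE = "http://example.com"
IMAGE_PATH = re.compile(r"^/images/(.+)$")


def redirect_target(path: str) -> Optional[str]:
    """Return the URL to send in a 301 Location header, or None if the
    request path does not follow the fixed image pattern."""
    match = IMAGE_PATH.match(path)
    if match is None:
        return None
    return REMOTE_BASE + "/images/" + match.group(1)


print(redirect_target("/images/xyz.jpg"))
print(redirect_target("/index.html"))
```

Requests that do not match the pattern are left alone, which is why the approach only works when the resource paths are predictable.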

+4

Per HTML and URLs at the W3C:

User agents must calculate the base URL to resolve relative URLs in accordance with [RFC1808]. The following is a brief description of how [RFC1808] relates to HTML. User agents must calculate the base URL in accordance with the following priorities (highest priority to lowest):

  • The base URL is set by the BASE element.
  • The base URL is set by the HTTP header (see [RFC2068]).
  • By default, the base URL refers to the current document.

In addition, OBJECT and APPLET define attributes that take precedence over the value set by the BASE element. Consult the definitions of these elements for more information about URL issues specific to them.
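The priority order quoted above can be sketched as a small function. This is an illustration only, not how any particular browser implements it; the names `effective_base`, `base_element` and `header_base`, and the example hosts, are hypothetical:

```python
from typing import Optional
from urllib.parse import urljoin


def effective_base(document_url: str,
                   base_element: Optional[str] = None,
                   header_base: Optional[str] = None) -> str:
    """Pick the base URL by priority: BASE element, then HTTP header,
    then the URL of the current document."""
    for candidate in (base_element, header_base):
        if candidate:
            return candidate
    return document_url


doc = "http://proxy.example/form.html"
# No BASE element and no header: relative links resolve against the document.
print(urljoin(effective_base(doc), "img/logo.png"))
# A BASE element wins over a base supplied by an HTTP header.
print(urljoin(effective_base(doc, "http://example.com/", "http://origin.example/app/"),
              "img/logo.png"))
```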

RFC 2068 is the original specification for HTTP 1.1. It defined the Content-Base and Content-Location headers to specify the base URL used to resolve relative URLs within the entity:

  14.11 Content-Base

    The Content-Base entity-header field may be used to specify the base
    URI for resolving relative URLs within the entity.  This header field
    is described as Base in RFC 1808, which is expected to be revised.

           Content-Base = "Content-Base" ":" absoluteURI

    If no Content-Base field is present, the base URI of an entity is
    defined either by its Content-Location (if that Content-Location URI
    is an absolute URI) or the URI used to initiate the request, in that
    order of precedence.  Note, however, that the base URI of the contents
    within the entity-body may be redefined within that entity-body.

  14.15 Content-Location

    The Content-Location entity-header field may be used to supply the
    resource location for the entity enclosed in the message.  In the case
    where a resource has multiple entities associated with it, and those
    entities actually have separate locations by which they might be
    individually accessed, the server should provide a Content-Location
    for the particular variant which is returned.  In addition, a server
    SHOULD provide a Content-Location for the resource corresponding to
    the response entity.

           Content-Location = "Content-Location" ":"
                             (absoluteURI | relativeURI)

    If no Content-Base header field is present, the value of Content-
    Location also defines the base URL for the entity (see section
    14.11).

    The Content-Location value is not a replacement for the original
    requested URI;  it is only a statement of the location of the resource
    corresponding to this particular entity at the time of the request.
    Future requests MAY use the Content-Location URI if the desire is to
    identify the source of that particular entity.

    A cache cannot assume that an entity with a Content-Location
    different from the URI used to retrieve it can be used to respond to
    later requests on that Content-Location URI.  However, the Content-
    Location can be used to differentiate between multiple entities
    retrieved from a single requested resource, as described in section
    13.6.

    If the Content-Location is a relative URI, the URI is interpreted
    relative to any Content-Base URI provided in the response.  If no
    Content-Base is provided, the relative URI is interpreted relative to
    the Request-URI.

RFC 2068 was obsoleted by RFC 2616, which is currently the HTTP 1.1 specification implemented by most web servers. It removes the Content-Base header from the HTTP 1.1 specification entirely and slightly redefines the semantics of Content-Location:

  14.14 Content-Location

    The Content-Location entity-header field MAY be used to supply the
    resource location for the entity enclosed in the message when that
    entity is accessible from a location separate from the requested
    resource URI.  A server SHOULD provide a Content-Location for the
    variant corresponding to the response entity;  especially in the case
    where a resource has multiple entities associated with it, and those
    entities actually have separate locations by which they might be
    individually accessed, the server SHOULD provide a Content-Location
    for the particular variant which is returned.

        Content-Location = "Content-Location" ":"
                          (absoluteURI | relativeURI)

    The value of Content-Location also defines the base URI for the
    entity.

    The Content-Location value is not a replacement for the original
    requested URI;  it is only a statement of the location of the resource
    corresponding to this particular entity at the time of the request.
    Future requests MAY specify the Content-Location URI as the request-
    URI if the desire is to identify the source of that particular
    entity.

    A cache cannot assume that an entity with a Content-Location
    different from the URI used to retrieve it can be used to respond to
    later requests on that Content-Location URI.  However, the Content-
    Location can be used to differentiate between multiple entities
    retrieved from a single requested resource, as described in section
    13.6.

    If the Content-Location is a relative URI, the relative URI is
    interpreted relative to the Request-URI.

    The meaning of the Content-Location header in PUT or POST requests is
    undefined; servers are free to ignore it in those cases.

It is important to note that "the value of Content-Location also defines the base URI for the entity" still applies at this point.

Moving forward, RFC 2616 was obsoleted by RFCs 7230-7235 (which have not yet been widely adopted). In particular, RFC 7231 completely redefines the semantics of Content-Location:

  3.1.4.2.  Content-Location

    The "Content-Location" header field references a URI that can be used
    as an identifier for a specific resource corresponding to the
    representation in this message payload.  In other words, if one
    were to perform a GET request on this URI at the time of this
    message generation, then a 200 (OK) response would contain the same
    representation that is enclosed as payload in this message.

      Content-Location = absolute-URI / partial-URI

    The Content-Location value is not a replacement for the effective
    Request URI (Section 5.5 of [RFC7230]).  It is representation
    metadata.  It has the same syntax and semantics as the header field
    of the same name defined for MIME body parts in Section 4 of
    [RFC2557].  However, its appearance in an HTTP message has some
    special implications for HTTP recipients.

    If Content-Location is included in a 2xx (Successful) response
    message and its value refers (after conversion to absolute form) to a
    URI that is the same as the effective request URI, then the recipient
    MAY consider the payload to be a current representation of that
    resource at the time indicated by the message origination date.  For
    a GET (Section 4.3.1) or HEAD (Section 4.3.2) request, this is the
    same as the default semantics when no Content-Location is provided by
    the server.  For a state-changing request like PUT (Section 4.3.4) or
    POST (Section 4.3.3), it implies that the server response contains
    the new representation of that resource, thereby distinguishing it
    from representations that might only report about the action (e.g.,
    "It worked!").  This allows authoring applications to update their
    local copies without the need for a subsequent GET request.

    If Content-Location is included in a 2xx (Successful) response
    message and its field-value refers to a URI that differs from the
    effective request URI, then the origin server claims that the URI is
    an identifier for a different resource corresponding to the enclosed
    representation.  Such a claim can only be trusted if both identifiers
    share the same resource owner, which cannot be programmatically
    determined via HTTP.

    o For a response to a GET or HEAD request, this is an indication
       that the effective request URI refers to a resource that is
       subject to content negotiation and the Content-Location
       field-value is a more specific identifier for the selected
       representation.

    o For a 201 (Created) response to a state-changing method, a
       Content-Location field-value that is identical to the Location
       field-value indicates that this payload is a current
       representation of the newly created resource.

    o Otherwise, such a Content-Location indicates that this payload is
       a representation reporting on the requested action status and
       that the same report is available (for future access with GET) at
       the given URI.  For example, a purchase transaction made via a
       POST request might include a receipt document as the payload of
       the 200 (OK) response;  the Content-Location field-value provides
       an identifier for retrieving a copy of that same receipt in the
       future.

    A user agent that sends Content-Location in a request message is
    stating that its value refers to where the user agent originally
    obtained the content of the enclosed representation (prior to any
    modifications made by that user agent).  In other words, the user
    agent is providing a back link to the source of the original
    representation.

    An origin server that receives a Content-Location field in a request
    message MUST treat the information as transitory request context
    rather than as metadata to be saved verbatim as part of the
    representation.  An origin server MAY use that context to guide in
    processing the request or to save it for other uses, such as within
    source links or versioning metadata.  However, an origin server MUST
    NOT use such context information to alter the request semantics.

    For example, if a client makes a PUT request on a negotiated resource
    and the origin server accepts that PUT (without redirection), then
    the new state of that resource is expected to be consistent with the
    one representation supplied in that PUT;  the Content-Location cannot
    be used as a form of reverse content selection identifier to update
    only one of the negotiated representations.  If the user agent had
    wanted the latter semantics, it would have applied the PUT directly
    to the Content-Location URI.

Most importantly, RFC 7231 also states:

  Appendix B. Changes from RFC 2616

    ...

    The definition of Content-Location has been changed to no longer
    affect the base URI for resolving relative URI references, due to
    poor implementation support and the undesirable effect of potentially
    breaking relative links in content-negotiated resources.
    (Section 3.1.4.2)

    ...

So, in response to the question that was asked:

  • according to RFC 2616, the answer is YES, Content-Location exists to indicate the base URL of an entity at the HTTP level.

  • according to RFC 7231, the answer is NO, Content-Location can no longer be used to specify the base URL of an entity.

AFAIK, RFC 7231 did not introduce a new header, or redefine an existing one, to restore the base URL behavior. So no HTTP header is available for setting the base URL. It can only be specified within the entity itself, if it needs to differ from the URL the entity was requested from.
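To make the difference between the three specifications concrete, here is a rough sketch of the base URI a recipient would derive at the HTTP level under each of them, ignoring any BASE element in the body. It only models the passages quoted above; the function name `http_base_uri` and the example URLs are hypothetical, and the sketch says nothing about what real browsers did (RFC 7231 itself cites poor implementation support):

```python
from typing import Optional
from urllib.parse import urljoin, urlsplit


def http_base_uri(rfc: int,
                  request_uri: str,
                  content_location: Optional[str] = None,
                  content_base: Optional[str] = None) -> str:
    """Base URI implied by the response headers under the given RFC."""
    if rfc == 2068:
        # Content-Base first, then an absolute Content-Location,
        # then the URI used to initiate the request.
        if content_base:
            return content_base
        if content_location and urlsplit(content_location).scheme:
            return content_location
        return request_uri
    if rfc == 2616:
        # Content-Base is gone; Content-Location defines the base URI,
        # and a relative value is interpreted against the Request-URI.
        if content_location:
            return urljoin(request_uri, content_location)
        return request_uri
    # RFC 7231: Content-Location no longer affects the base URI.
    return request_uri


req = "http://proxy.example/forms/edit"
loc = "http://example.com/app/edit.html"
for rfc in (2068, 2616, 7231):
    base = http_base_uri(rfc, req, content_location=loc)
    print(rfc, urljoin(base, "style.css"))
```

Under the first two specifications the relative link would resolve against the Content-Location host; under RFC 7231 it resolves against the request URL regardless of the header.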

+3

Source: https://habr.com/ru/post/1299687/

