Submission
Due Date
By Friday 8 February 2019 23:42
Directory Hierarchy
Create your git repository (replace john.smith with your own login).
$ git clone git@git.cri.epita.net:p/2022-s4-tp/tp01-john.smith
It must contain the following files and directories:
- pw_01_tcp_client/
  - AUTHORS
  - main.c
  - print_page.c
  - print_page.h
  - Makefile
The AUTHORS file must contain the following information:
First Name
Family Name
Login
Email Address
The last character of your AUTHORS file must be a newline character.
For instance:
$ cat AUTHORS
John
Smith
john.smith
john.smith@epita.fr
$ # Command prompt ready for the next command...
Be careful: if you do not follow all the given instructions, no points will be awarded for your answers.
Introduction
In this practical, you will design a client that connects to a web server. The web server must be able to handle unencrypted HTTP requests. We will use the GET method only.
As an example, we will connect to the server of the example.com domain.
We can use the nc command to get the contents of a web page.
$ nc example.com 80
GET http://www.example.com HTTP/1.0

HTTP/1.0 200 OK
Accept-Ranges: bytes
Cache-Control: max-age=604800
Content-Type: text/html; charset=UTF-8
Date: Thu, 27 Dec 2018 21:54:48 GMT
Etag: "1541025663+ident"
Expires: Thu, 03 Jan 2019 21:54:48 GMT
Last-Modified: Fri, 09 Aug 2013 23:54:35 GMT
Server: ECS (dca/2468)
Vary: Accept-Encoding
X-Cache: HIT
Content-Length: 1270
Connection: close

<!doctype html>
<html>
<head>
<title>Example Domain</title>
<meta charset="utf-8" />
<meta http-equiv="Content-type" content="text/html; charset=utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<style type="text/css">
body {
background-color: #f0f0f2;
margin: 0;
padding: 0;
font-family: "Open Sans", "Helvetica Neue", Helvetica, Arial, sans-serif;
}
div {
width: 600px;
margin: 5em auto;
padding: 50px;
background-color: #fff;
border-radius: 1em;
}
a:link, a:visited {
color: #38488f;
text-decoration: none;
}
@media (max-width: 700px) {
body {
background-color: #fff;
}
div {
width: auto;
margin: 0 auto;
border-radius: 0;
padding: 1em;
}
}
</style>
</head>
<body>
<div>
<h1>Example Domain</h1>
<p>This domain is established to be used for illustrative examples in documents. You may use this
domain in examples without prior coordination or asking for permission.</p>
<p><a href="http://www.iana.org/domains/example">More information...</a></p>
</div>
</body>
</html>
Line 1 is the command. Its first argument is the host name, which is the domain name assigned to a host computer. Its second argument is the port number, which is 80 for HTTP servers.
Then, on line 2, we find the GET request method. This method is used to get a resource. Its first argument is the URL of the resource we want to get. Its second argument is the version of the protocol (HTTP/1.0). The method is then followed by two sequences of newline characters: \r\n\r\n. To send them, you have to press the Enter key twice.
From line 4 to the end, we have the response to the GET method that was sent by the server.
The first part of the response (from line 4 to line 16) is the header. It contains information about the response.
The second part of the response (from line 18 to the end of the listing) is the default page returned by the server for the specified URL (http://www.example.com).
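The boundary between the two parts can also be found programmatically, because HTTP ends the header with an empty line, that is, with the byte sequence \r\n\r\n. The following is only a minimal sketch and is not part of the provided files; the function name find_body is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns a pointer to the first byte of the body
 * of a NUL-terminated HTTP response, or NULL if the empty line that
 * ends the header has not been found.
 */
const char *find_body(const char *response)
{
    const char *separator = strstr(response, "\r\n\r\n");

    if (separator == NULL)
        return NULL;

    /* Skip the four bytes of the separator itself. */
    return separator + 4;
}
```

This practical only asks you to print the whole response, so such a helper is not required; it is shown only to make the structure of the response explicit.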
Provided Files
The print_page.h File
#ifndef GET_PAGE_H
#define GET_PAGE_H
void print_page(const char *host);
#endif
The print_page.c File
#define _GNU_SOURCE
#include <netdb.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <err.h>
#include <unistd.h>
char *build_query(const char *host, size_t *len)
{
// TODO
}
void print_page(const char *host)
{
// TODO
}
Implementation
Generating the HTTP Request
We want to connect to a web server and print its default page.
Your program will take one argument only: the host name.
For instance, the result of the following command should be similar to that of the nc command in the above example.
$ ./main example.com
HTTP/1.0 200 OK
Cache-Control: max-age=604800
Content-Type: text/html; charset=UTF-8
Date: Thu, 27 Dec 2018 23:14:59 GMT
Etag: "1541025663+ident"
Expires: Thu, 03 Jan 2019 23:14:59 GMT
Last-Modified: Fri, 09 Aug 2013 23:54:35 GMT
Server: ECS (dca/5327)
Vary: Accept-Encoding
X-Cache: HIT
Content-Length: 1270
Connection: close

<!doctype html>
<html>
<head>
<title>Example Domain</title>
<meta charset="utf-8" />
<meta http-equiv="Content-type" content="text/html; charset=utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<style type="text/css">
body {
background-color: #f0f0f2;
margin: 0;
padding: 0;
font-family: "Open Sans", "Helvetica Neue", Helvetica, Arial, sans-serif;
}
div {
width: 600px;
margin: 5em auto;
padding: 50px;
background-color: #fff;
border-radius: 1em;
}
a:link, a:visited {
color: #38488f;
text-decoration: none;
}
@media (max-width: 700px) {
body {
background-color: #fff;
}
div {
width: auto;
margin: 0 auto;
border-radius: 0;
padding: 1em;
}
}
</style>
</head>
<body>
<div>
<h1>Example Domain</h1>
<p>This domain is established to be used for illustrative examples in documents. You may use this
domain in examples without prior coordination or asking for permission.</p>
<p><a href="http://www.iana.org/domains/example">More information...</a></p>
</div>
</body>
</html>
The first thing to do is to generate the request.
That is the purpose of the build_query() function.
char *build_query(const char *host, size_t *len);
- Arguments:
  - host: the host name.
  - len: a pointer to the length of the request. The function must store the length of the generated request at this address.
- Return Value: the request that has been generated. This string of characters must be dynamically allocated and will be freed by the caller.
- Note:
  - For instance, if the host name is "example.com", the generated request should be: "GET http://www.example.com HTTP/1.0\r\n\r\n"
  - If the request cannot be generated, the program should exit immediately with an error message.
  - See asprintf(3).
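As an illustration of asprintf(3), here is a minimal sketch of how such a request could be built. The name build_query_sketch is hypothetical (chosen so that it does not clash with the function you have to write), and the www. prefix simply mirrors the example request given in the note above.

```c
#define _GNU_SOURCE /* asprintf() is a GNU extension */
#include <err.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical sketch: builds a GET request for the given host. */
char *build_query_sketch(const char *host, size_t *len)
{
    char *query = NULL;

    /* asprintf() allocates the buffer and returns the number of
     * bytes written (without the final NUL), or -1 on failure. */
    int n = asprintf(&query, "GET http://www.%s HTTP/1.0\r\n\r\n", host);
    if (n == -1)
        errx(EXIT_FAILURE, "could not build the request");

    *len = (size_t)n;
    return query; /* to be released by the caller with free() */
}
```

With the host name "example.com", this sketch should produce a request of 39 bytes.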
The print_page() Function
Connecting to the Host
Only the name of the host is passed as an argument to the print_page() function (e.g. "example.com").
void print_page(const char *host);
In order to connect to the host, you will use the getaddrinfo(3) function. The hints argument will point to an addrinfo structure. Only two fields of this structure should be set to nonzero values:
- ai_family should be equal to AF_INET.
- ai_socktype should be equal to SOCK_STREAM.
To initialize the other fields to zero, you should use memset(3).
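The initialization described above could look like the following minimal sketch. The helper name init_hints is hypothetical; in your own code, these three statements can just as well appear directly inside print_page().

```c
#define _GNU_SOURCE
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: prepares the hints for getaddrinfo(3). */
void init_hints(struct addrinfo *hints)
{
    /* Set every field to zero first... */
    memset(hints, 0, sizeof(*hints));

    /* ...then fill in the only two fields we care about. */
    hints->ai_family = AF_INET;       /* IPv4 only */
    hints->ai_socktype = SOCK_STREAM; /* TCP */
}
```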
The getaddrinfo() function returns a linked list. You should iterate through this list and try, for each element, to create a socket and connect it to the host. As soon as a connection succeeds for one element of the list, you can break out of the loop and free the list. If no connection has succeeded, your program must exit immediately with an error message.
To create and connect a socket, you should use the socket(2) and connect(2) functions.
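The loop described above can be sketched as follows. This is only one possible arrangement: the function name connect_to_host and its port parameter are hypothetical, and the sketch returns -1 on failure instead of exiting, so the caller would still have to exit with an error message (for instance with errx(3)) as the subject requires.

```c
#define _GNU_SOURCE
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical sketch: returns a connected socket, or -1 on failure. */
int connect_to_host(const char *host, const char *port)
{
    struct addrinfo hints;
    struct addrinfo *result;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_STREAM;

    /* getaddrinfo() returns 0 on success and fills in the list. */
    if (getaddrinfo(host, port, &hints, &result) != 0)
        return -1;

    int sfd = -1;
    for (struct addrinfo *rp = result; rp != NULL; rp = rp->ai_next)
    {
        sfd = socket(rp->ai_family, rp->ai_socktype, rp->ai_protocol);
        if (sfd == -1)
            continue; /* try the next address */

        if (connect(sfd, rp->ai_addr, rp->ai_addrlen) == 0)
            break; /* success: keep this socket */

        close(sfd);
        sfd = -1;
    }

    /* The list is no longer needed, whether we succeeded or not. */
    freeaddrinfo(result);
    return sfd;
}
```

For a web server, the call would be connect_to_host("example.com", "80"), since 80 is the port used by HTTP servers.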
Sending the Request and Waiting for the Response
When a connection is successful, you can use the file descriptor returned by the socket() function to communicate with the host. You can use the write(2) function to send the request to the host and the read(2) function to read its response.
Finally, print the response of the host to the standard output and do not forget to close the socket (see close(2)).
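Keep in mind that write(2) may send fewer bytes than requested and that read(2) returns the response piece by piece, so both are usually called in a loop. The following is a minimal sketch under that assumption; the helper names write_all and copy_fd are hypothetical.

```c
#include <unistd.h>

/* Hypothetical helper: writes exactly len bytes, or returns -1. */
int write_all(int fd, const char *buf, size_t len)
{
    size_t done = 0;

    while (done < len)
    {
        ssize_t n = write(fd, buf + done, len - done);
        if (n == -1)
            return -1;
        done += (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: copies everything readable from in to out.
 * read() returns 0 once the peer has closed the connection. */
int copy_fd(int in, int out)
{
    char buf[4096];
    ssize_t n;

    while ((n = read(in, buf, sizeof(buf))) > 0)
    {
        if (write_all(out, buf, (size_t)n) == -1)
            return -1;
    }
    return n == 0 ? 0 : -1;
}
```

With a connected socket sfd, the request would be sent with write_all(sfd, query, len) and the response printed with copy_fd(sfd, STDOUT_FILENO), followed by close(sfd).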
The Main Function
Your main function must accept only one command-line argument, which is the host name. If the number of arguments is different from one, your program must exit immediately with an error message.
Your main function should call the print_page() function with the host name.
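The argument check can be sketched as follows. The helper name host_from_args is hypothetical; the point is only that argv[0] holds the program name, so a single command-line argument corresponds to argc == 2.

```c
#include <stddef.h>

/* Hypothetical helper: returns the host name, or NULL if the number
 * of command-line arguments is not exactly one. */
const char *host_from_args(int argc, char *argv[])
{
    /* argv[0] is the program name, so one argument means argc == 2. */
    if (argc != 2)
        return NULL;

    return argv[1];
}
```

In main, a NULL result would lead to an immediate exit with an error message (for instance with errx(3)), and any other result would be passed to print_page().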
Also, you must provide a simple Makefile in order to compile your code.
Here is another example with the perdu.com host name:
$ ./main perdu.com
HTTP/1.1 200 OK
Date: Fri, 28 Dec 2018 21:42:22 GMT
Server: Apache
Last-Modified: Thu, 02 Jun 2016 06:01:08 GMT
ETag: "cc-5344555136fe9"
Accept-Ranges: bytes
Content-Length: 204
Vary: Accept-Encoding
Connection: close
Content-Type: text/html

<html><head><title>Vous Etes Perdu ?</title></head><body><h1>Perdu sur l'Internet ?</h1><h2>Pas de panique, on va vous aider</h2><strong><pre> * <----- vous êtes ici</pre></strong></body></html>