[147] | 1 | \documentclass{article}
|
---|
| 2 | \title{Database Connection for the CPS Facilitator Tool}
|
---|
| 3 | \author{Jos Kraaijeveld}
|
---|
| 4 | \date{\today}
|
---|
| 5 | \usepackage{moreverb}
|
---|
| 6 | \usepackage{fullpage}
|
---|
| 7 | \usepackage{framed}
|
---|
| 8 | \usepackage{float}
|
---|
| 9 | \usepackage{multirow}
|
---|
[160] | 10 | \usepackage[final]{pdfpages}
|
---|
| 11 | \usepackage{hyperref}
|
---|
[147] | 12 | \begin{document}
|
---|
| 13 |
|
---|
| 14 | \maketitle
|
---|
| 15 |
|
---|
| 16 | \pagebreak
|
---|
| 17 |
|
---|
| 18 | \tableofcontents
|
---|
| 19 |
|
---|
| 20 | \pagebreak
|
---|
| 21 |
|
---|
| 22 | \section{Introduction}
|
---|
| 23 | This document describes the database design and implementation for the CPS Facilitator Tool. The goal is to make
|
---|
| 24 | sure everyone can communicate with the RDF-based database without having to write queries. The DBInterface is
|
---|
| 25 | written in PHP and requires slight PHP knowledge before use. It is heavily based on initial work by Bas van Nuland. \\
|
---|
| 26 |
|
---|
| 27 | If you use this framework, you can skip to the `How to use'-part of this document (\ref{howtouse}). If you are
|
---|
| 28 | planning to improve this framework, I recommend reading the RDF Primer\footnote{http://www.w3.org/TR/rdf-primer/} and
|
---|
| 29 | SPARQL Query Language for RDF\footnote{http://www.w3.org/TR/rdf-sparql-query/}. A full specification of the current
|
---|
| 30 | implementation can be found in section \ref{spec}. The areas I suggest improving upon first are described in section
|
---|
| 31 | \ref{futurework}.
|
---|
| 32 |
|
---|
| 33 | \section{How to use}\label{howtouse}
|
---|
| 34 | \subsection{Initialization}
|
---|
| 35 | Before the database can be used, you should initialize a \textit{DatabaseInterface} object. This is the only
|
---|
| 36 | database class you will use. This is done calling the DatabaseInterface constructor.
|
---|
| 37 | \begin{verbatimtab}
|
---|
| 38 | $db = new DatabaseInterface();
|
---|
| 39 | \end{verbatimtab}
|
---|
| 40 |
|
---|
| 41 | \subsection{Basic queries}
|
---|
| 42 | \subsubsection{Retrieving data}
|
---|
| 43 | Getting data is done by calling the get method of the DatabaseInterface class. The specification for this
|
---|
| 44 | method is as follows. \\ \\
|
---|
| 45 | \begin{tabular}{| l || r |} \hline
|
---|
| 46 | \multicolumn{2}{|c|}{Function get()} \\ \hline
|
---|
| 47 | First argument: \$type & String \\ \hline
|
---|
| 48 | Second argument: \$arguments & Array \\ \hline
|
---|
| 49 | Return type: & Array/Variable \\ \hline
|
---|
| 50 | \end{tabular} \\ \\
|
---|
| 51 |
|
---|
| 52 | The return type is an Array of objects with the type specified as the first argument. For instance, if
|
---|
| 53 | the first argument is ``user", the function will return an array of User objects. The second argument can be
|
---|
| 54 | left out if you want to retrieve all entries of a specific type. Below are a few examples showing the usage
|
---|
| 55 | of get(). A description of what arguments are possible per type is given in section \ref{arguments}.
|
---|
| 56 |
|
---|
| 57 | \begin{figure}[H]
|
---|
| 58 | \caption{Retrieving all questions and echo their title.}
|
---|
| 59 | \begin{framed}
|
---|
| 60 | \begin{verbatimtab}
|
---|
| 61 | $questions = $db->get("question");
|
---|
| 62 | foreach ($questions as $question)
|
---|
| 63 | {
|
---|
| 64 | echo $question->title;
|
---|
| 65 | }
|
---|
| 66 | \end{verbatimtab}
|
---|
| 67 | \end{framed}
|
---|
| 68 | \end{figure}
|
---|
| 69 |
|
---|
| 70 | \begin{figure}[H]
|
---|
| 71 | \caption{Retrieving all surveys containing questions with IDs q1 and q2, created by the user
|
---|
| 72 | Jos, printing all questions in those surveys}
|
---|
| 73 | \begin{framed}
|
---|
| 74 | \begin{verbatimtab}
|
---|
| 75 | //To get all surveys created by the user Jos, we need his UID.
|
---|
| 76 | //We first query the database for the User object belonging to Jos.
|
---|
| 77 | $userResults = $db->get("user", array("name" => "Jos"));
|
---|
| 78 | //Assuming there is a result, the UID is:
|
---|
| 79 | $josUID = $userResults[0]->uid;
|
---|
| 80 | //Now to get the requested surveys:
|
---|
| 81 | $surveys = $db->get("survey", array("questions" => array("q1", "q2"),
|
---|
| 82 | "creator" => $josUID));
|
---|
| 83 | //And to print all the questions in these surveys:
|
---|
| 84 | foreach($surveys as $survey)
|
---|
| 85 | {
|
---|
| 86 | echo "All the questions in " . $survey->name;
|
---|
| 87 | foreach($survey->questions as $question)
|
---|
| 88 | {
|
---|
| 89 | print_r($question);
|
---|
| 90 | }
|
---|
| 91 | }
|
---|
| 92 | \end{verbatimtab}
|
---|
| 93 | \end{framed}
|
---|
| 94 | \end{figure}
|
---|
| 95 |
|
---|
| 96 | \subsubsection{Storing data}
|
---|
| 97 | Storing data is easy, as long as you stick to using the given PHP Classes. Retrieve these classes by using
|
---|
| 98 | the get() function, and pass them as a parameter to the set() function of the DatabaseInterface. This set()
|
---|
| 99 | function will determine what type the object is and store it at the correct location. The method overview is
|
---|
| 100 | as follows: \\ \\
|
---|
| 101 | \begin{tabular}{| l || r |} \hline
|
---|
| 102 | \multicolumn{2}{|c|}{Function set()} \\ \hline
|
---|
| 103 | First argument: \$rToolObject & Variable \\ \hline
|
---|
| 104 | \end{tabular} \\ \\
|
---|
| 105 |
|
---|
| 106 | Another important thing to note is that currently it will overwrite a previous object with the same UID in
|
---|
| 107 | the database. The following examples show how to create new objects and save them, as well as edit old
|
---|
| 108 | objects and save them.
|
---|
| 109 |
|
---|
| 110 | \begin{figure}[H]
|
---|
| 111 | \caption{Creating a new Answer object and storing this in the database.}
|
---|
| 112 | \begin{framed}
|
---|
| 113 | \begin{verbatimtab}
|
---|
| 114 | //For this example, I choose a random question to answer
|
---|
| 115 | $questions = $db->get("question");
|
---|
| 116 | $question = $questions[2];
|
---|
| 117 | //Note two things:
|
---|
| 118 | //1 - If you pass 'null' as first argument when creating any object
|
---|
| 119 | // A new UID will be generated, indicating a new object.
|
---|
| 120 | //2 - Depending on the question, there can be multiple answers
|
---|
| 121 | // This means the values-argument (third argument) is an array.
|
---|
| 122 | $answer = new Answer(null, $question, array("12345", "four"));
|
---|
| 123 | //Save the answer in the database
|
---|
| 124 | $db->set($answer);
|
---|
| 125 | \end{verbatimtab}
|
---|
| 126 | \end{framed}
|
---|
| 127 | \end{figure}
|
---|
| 128 |
|
---|
| 129 | \begin{figure}[H]
|
---|
| 130 | \caption{Getting a Survey object from the database and removing the first question. Also alter this
|
---|
| 131 | question.}
|
---|
| 132 | \begin{framed}
|
---|
| 133 | \begin{verbatimtab}
|
---|
| 134 | //Get the survey
|
---|
| 135 | $surveyResults = $db->get("survey", array("uid" => "b91d39e8667372e220bb861b3f94b5bd"));
|
---|
| 136 | $survey = surveyResults[0];
|
---|
| 137 | //Remove the question
|
---|
| 138 | $question = $survey->questions[0];
|
---|
| 139 | unset($survey->questions[0]);
|
---|
| 140 | //Change the question
|
---|
| 141 | $question->title = "New Title";
|
---|
| 142 | //Save the survey and question
|
---|
| 143 | $db->batchSet(array($survey, $question));
|
---|
| 144 | \end{verbatimtab}
|
---|
| 145 | \end{framed}
|
---|
| 146 | \end{figure}
|
---|
| 147 |
|
---|
| 148 |
|
---|
| 149 | \subsection{Arguments}\label{arguments}
|
---|
| 150 | The arguments you can supply when calling the get() function differ greatly per type. Unfortunately, there is no
|
---|
| 151 | way around this, so here is a complete overview of arguments per type. \\
|
---|
| 152 | \begin{tabular}{| l | c | c | l |} \hline
|
---|
| 153 | \textbf{Type} & \textbf{Argument name} & \textbf{Argument type} & \textbf{Extra notes} \\ \hline \hline
|
---|
| 154 | \multirow{3}{*}{Answer} & uid & String & \\
|
---|
| 155 | & question & String & UID of the question \\
|
---|
| 156 | & values & Array of Strings & \\ \hline \hline
|
---|
| 157 | \multirow{4}{*}{AnswerSet} & uid & String & \\
|
---|
| 158 | & survey & String & UID of the Survey \\
|
---|
| 159 | & respondent & String & UID of the Respondent \\
|
---|
| 160 | & answers & Array of Strings & UIDs of the Answers \\ \hline \hline
|
---|
| 161 | \multirow{4}{*}{Application} & uid & String & \\
|
---|
| 162 | & title & String & \\
|
---|
| 163 | & description & String & \\
|
---|
| 164 | & style & String & \\ \hline \hline
|
---|
[172] | 165 | \multirow{5}{*}{Question} & code & String & \\
|
---|
[147] | 166 | & title & String & \\
|
---|
| 167 | & type & String & \\
|
---|
| 168 | & description & String & \\
|
---|
| 169 | & category & String & \\
|
---|
[172] | 170 | & definedanswers & Array of Strings & String values of possible answers \\ \hline \hline
|
---|
[147] | 171 | \multirow{3}{*}{Respondent/User}& uid & String & \\
|
---|
| 172 | & name & String & \\
|
---|
| 173 | & password & String & Use hashes to store passwords \\ \hline \hline
|
---|
| 174 | \multirow{6}{*}{Session} & uid & String & \\
|
---|
| 175 | & title & String & \\
|
---|
[160] | 176 | & creator & String & UID of the User \\
|
---|
[147] | 177 | & datetime & String & Unix Timestamp. No way to search on intervals (yet). \\
|
---|
| 178 | & applications & Array of Strings & UIDs of the Applications \\
|
---|
| 179 | & surveys & Array of Strings & UIDs of the Surveys \\
|
---|
| 180 | & answersets & Array of Strings & UIDs of the AnswerSets \\ \hline \hline
|
---|
| 181 | \multirow{4}{*}{Survey} & uid & String & \\
|
---|
| 182 | & title & String & \\
|
---|
| 183 | & description & String & \\
|
---|
| 184 | & creator & String & UID of the user that created this survey \\
|
---|
| 185 | & questions & Array of Strings & UIDs of the Questions \\ \hline \hline
|
---|
| 186 | \end{tabular} \\
|
---|
| 187 |
|
---|
| 188 | Whenever an Array has to be supplied, it will match for results that satisfy \textit{all} the constraints given
|
---|
| 189 | in the array.
|
---|
| 190 |
|
---|
[160] | 191 | \section{Specification}\label{spec}
|
---|
| 192 | \subsection{Models}
|
---|
| 193 | The framework heavily relies on the Object Oriented Programming paradigm, and only allows you to create, edit
|
---|
| 194 | and store data through instances of precreated classes. All these classes inherit from a global
|
---|
| 195 | ResearchToolObject class. A UML class diagram of the model classes can be found in Appendix \ref{models}.
|
---|
| 196 | \subsection{Connectors}
|
---|
| 197 | Although the front end developer only uses one general DatabaseInterface, the different files for the different
|
---|
| 198 | datatypes are accessed by seperate connectors. A connector has to implement the IConnector interface, which
|
---|
[172] | 199 | mainly enforces the get() and set() methods to ensure a connector can perform as expected. The rest of the
|
---|
| 200 | methods are implemented in the baseclass Connector, which should be extended. In every connector,
|
---|
[160] | 201 | the get() and set() methods are different. A get() method builds the querystring based on the given arguments,
|
---|
| 202 | then retrieves the data and performs other necessary queries (like retrieving a different object from another
|
---|
| 203 | file) to ensure the PHP DataModel is comoplete. A set() method has to store the data accordingly.
|
---|
| 204 | \subsection{PHPDoc}
|
---|
| 205 | PHPDoc for the database classes can be found at
|
---|
[172] | 206 | \href{http://svn.tbm.tudelft.nl/TBM-CPS/RESEARCHTOOL/Doc/Backend/PHPDoc/index.html}{SVN}.
|
---|
[147] | 207 |
|
---|
| 208 | \section{Future Work}\label{futurework}
|
---|
[179] | 209 | \subsection{Adhere to Open-Closed Principle}
|
---|
| 210 | At the moment, \textit{the DatabaseInterface.php} class violates the Open-Closed principle. Every new connector,
|
---|
| 211 | four lines need to be added in this class. This is unwanted if new RDF datatypes are added. The same holds for
|
---|
| 212 | the method createArguments in this class: extra cases have to be added if more types of arguments become possible.
|
---|
| 213 | Adhering to the Open-Closed principle will allow for easier extention and maintenance of the database-related classes.
|
---|
[147] | 214 |
|
---|
[179] | 215 | \subsection{Deal with invalid or missing values}
|
---|
| 216 | This goes in two parts. Firstly, the database now assumes that given an entry, all the fields for its type are there and
|
---|
| 217 | filled in. This is unwanted since it can cause issues with backwards compatability of certain .rdf files. Secondly, it
|
---|
| 218 | assumes that fields containing a UID to a different entry actually point to a valid entry. For instance, a Survey query will
|
---|
| 219 | try to get a User object as its creator value from the database as well, assuming it exists. It will break if this user object
|
---|
| 220 | does not exist.
|
---|
| 221 |
|
---|
[180] | 222 | \subsection{Optimize queries}
|
---|
| 223 | Queries as they are at the moment can be a bottleneck in performance if the sets get really large. This is because of a
|
---|
| 224 | couple of things. For one, there are duplicate queries being executed: if a Session object is retrieved, it also retrieves
|
---|
| 225 | all the corresponding AnswerSets, which in turn retrieves the corresponding Questions. However, the Session also
|
---|
| 226 | retrieves the Surveys related to it, which afterwards also retrieve the same questions. There has been no benchmarking
|
---|
| 227 | of performance so I have no clue at what size of the .rdf files this will become an issue.
|
---|
| 228 |
|
---|
| 229 | I have a gut feeling that the SPARQL queries can be optimized some as well. This is regarding entries with zero or more
|
---|
| 230 | elements of a specific type, like Survey. Now, there is a second query for getting all the IDs for all the questions, but
|
---|
| 231 | there must be a way to introduce this into the first query without getting the exponential growth of the RDF 'different
|
---|
| 232 | tree options'.
|
---|
| 233 |
|
---|
| 234 |
|
---|
[160] | 235 | \appendix
|
---|
| 236 | \pagebreak
|
---|
| 237 | \section{Class diagram of the models}\label{models}
|
---|
| 238 | \begin{figure}
|
---|
| 239 | \includepdf[scale=1.22]{DBModels.pdf}
|
---|
| 240 | \end{figure}
|
---|
[147] | 241 |
|
---|
| 242 | \end{document}
|
---|