% Source: Doc/Backend/DBInterface.tex @ 189
% Revision 189 (jkraaijeveld): Updated .pdf documentation. Model diagram is still outdated though.
\documentclass{article}
\title{Database Connection for the CPS Facilitator Tool}
\author{Jos Kraaijeveld}
\date{\today}
\usepackage{moreverb}
\usepackage{fullpage}
\usepackage{framed}
\usepackage{float}
\usepackage{multirow}
\usepackage[final]{pdfpages}
\usepackage{hyperref}
\begin{document}

\maketitle

\pagebreak

\tableofcontents

\pagebreak

\section{Introduction}
	This document describes the database design and implementation for the CPS Facilitator Tool. The goal is to let
	everyone communicate with the RDF-based database without having to write queries. The DBInterface is
	written in PHP and requires only basic PHP knowledge to use. It is heavily based on initial work by Bas van Nuland. \\

	If you only want to use this framework, you can skip to the `How to use' part of this document (section \ref{howtouse}). If you are
	planning to improve this framework, I recommend reading the RDF Primer\footnote{\url{http://www.w3.org/TR/rdf-primer/}} and
	SPARQL Query Language for RDF\footnote{\url{http://www.w3.org/TR/rdf-sparql-query/}}. A full specification of the current
	implementation can be found in section \ref{spec}. The areas I suggest improving first are described in section
	\ref{futurework}.

\section{How to use}\label{howtouse}
	\subsection{Initialization}
		Before the database can be used, you should initialize a \textit{DatabaseInterface} object. This is the only
		database class you will use. Create it by calling the DatabaseInterface constructor.
		\begin{verbatimtab}
			$db = new DatabaseInterface();
		\end{verbatimtab}

	\subsection{Basic queries}
		\subsubsection{Retrieving data}
			Data is retrieved by calling the get() method of the DatabaseInterface class. The specification for this
			method is as follows. \\ \\
			\begin{tabular}{| l || r |} \hline
				\multicolumn{2}{|c|}{Function get()} \\ \hline
				First argument: \$type & String \\ \hline
				Second argument: \$arguments & Array \\ \hline
				Return type: & Array/Variable \\ \hline
			\end{tabular} \\ \\

			The return value is an Array of objects of the type specified as the first argument. For instance, if
			the first argument is ``user'', the function will return an array of User objects. The second argument can be
			left out if you want to retrieve all entries of a specific type. Below are a few examples showing the usage
			of get(). The arguments that are possible per type are described in section \ref{arguments}.

			\begin{figure}[H]
				\caption{Retrieving all questions and echoing their titles.}
				\begin{framed}
					\begin{verbatimtab}
$questions = $db->get("question");
foreach ($questions as $question)
{
	echo $question->title;
}
					\end{verbatimtab}
				\end{framed}
			\end{figure}

			\begin{figure}[H]
				\caption{Retrieving all surveys containing questions with IDs q1 and q2, created by the user
					Jos, and printing all questions in those surveys.}
				\begin{framed}
					\begin{verbatimtab}
//To get all surveys created by the user Jos, we need his UID.
//We first query the database for the User object belonging to Jos.
$userResults = $db->get("user", array("name" => "Jos"));
//Assuming there is a result, the UID is:
$josUID = $userResults[0]->uid;
//Now to get the requested surveys:
$surveys = $db->get("survey", array("questions" => array("q1", "q2"),
                                    "creator" => $josUID));
//And to print all the questions in these surveys:
foreach($surveys as $survey)
{
	echo "All the questions in " . $survey->name;
	foreach($survey->questions as $question)
	{
		print_r($question);
	}
}
					\end{verbatimtab}
				\end{framed}
			\end{figure}

		\subsubsection{Storing data}
			Storing data is easy, as long as you stick to using the given PHP classes. Retrieve objects of these classes with
			the get() function, or create new instances, and pass them as a parameter to the set() function of the
			DatabaseInterface. The set() function determines the type of the object and stores it at the correct location.
			The method overview is as follows: \\ \\
			\begin{tabular}{| l || r |} \hline
				\multicolumn{2}{|c|}{Function set()} \\ \hline
				First argument: \$rToolObject & Variable \\ \hline
			\end{tabular} \\ \\

			Note that set() currently overwrites any previous object with the same UID in
			the database. The following examples show how to create new objects and save them, as well as how to edit
			existing objects and save them.

			\begin{figure}[H]
				\caption{Creating a new Answer object and storing it in the database.}
				\begin{framed}
					\begin{verbatimtab}
//For this example, I pick an arbitrary question to answer
$questions = $db->get("question");
$question = $questions[2];
//Note two things:
//1 - If you pass 'null' as first argument when creating any object,
//    a new UID will be generated, indicating a new object.
//2 - Depending on the question, there can be multiple answers.
//    This means the values-argument (third argument) is an array.
$answer = new Answer(null, $question, array("12345", "four"));
//Save the answer in the database
$db->set($answer);
					\end{verbatimtab}
				\end{framed}
			\end{figure}

			\begin{figure}[H]
				\caption{Getting a Survey object from the database, removing its first question, and altering
					that question.}
				\begin{framed}
					\begin{verbatimtab}
//Get the survey
$surveyResults = $db->get("survey", array("uid" => "b91d39e8667372e220bb861b3f94b5bd"));
$survey = $surveyResults[0];
//Remove the question
$question = $survey->questions[0];
unset($survey->questions[0]);
//Change the question
$question->title = "New Title";
//Save the survey and question
$db->batchSet(array($survey, $question));
					\end{verbatimtab}
				\end{framed}
			\end{figure}


	\subsection{Arguments}\label{arguments}
		The arguments you can supply when calling the get() function differ greatly per type. Unfortunately, there is no
		way around this, so here is a complete overview of arguments per type. \\
		\begin{tabular}{| l | c | c | l |} \hline
			\textbf{Type} & \textbf{Argument name} & \textbf{Argument type} & \textbf{Extra notes} \\ \hline \hline
			\multirow{3}{*}{Answer} & uid & String & \\
				& question & String & UID of the Question \\
				& values & Array of Strings & \\ \hline \hline
			\multirow{4}{*}{AnswerSet} & uid & String & \\
				& survey & String & UID of the Survey \\
				& respondent & String & UID of the Respondent \\
				& answers & Array of Strings & UIDs of the Answers \\ \hline \hline
			\multirow{4}{*}{Application} & uid & String & \\
				& title & String & \\
				& description & String & \\
				& style & String & \\ \hline \hline
			\multirow{6}{*}{Question} & code & String & \\
				& title & String & \\
				& type & String & \\
				& description & String & \\
				& category & String & \\
				& definedanswers & Array of Strings & String values of possible answers \\ \hline \hline
			\multirow{3}{*}{Respondent/User} & uid & String & \\
				& name & String & \\
				& password & String & Use hashes to store passwords \\ \hline \hline
			\multirow{7}{*}{Session} & uid & String & \\
				& title & String & \\
				& creator & String & UID of the User \\
				& datetime & String & Unix timestamp. No way to search on intervals (yet). \\
				& applications & Array of Strings & UIDs of the Applications \\
				& surveys & Array of Strings & UIDs of the Surveys \\
				& answersets & Array of Strings & UIDs of the AnswerSets \\ \hline \hline
			\multirow{5}{*}{Survey} & uid & String & \\
				& title & String & \\
				& description & String & \\
				& creator & String & UID of the User that created this survey \\
				& questions & Array of Strings & UIDs of the Questions \\ \hline \hline
		\end{tabular} \\

		Whenever an Array is supplied as an argument, only results that satisfy \textit{all} the constraints given
		in the array are matched.
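
		To make the `all constraints' rule concrete, the following is a minimal sketch in plain PHP that works on
		in-memory arrays. The helper \texttt{matchesAll()} and the sample data are hypothetical and not part of the
		framework; the real connectors express this matching in their queries.

```php
<?php
// Hypothetical helper (not part of the framework): returns true only if
// $entryValues contains every value listed in $required.
function matchesAll(array $entryValues, array $required)
{
    // array_diff() keeps the required values that are missing from the entry.
    return count(array_diff($required, $entryValues)) === 0;
}

// Hypothetical in-memory stand-in for stored surveys: title => question UIDs.
$surveys = array(
    "Intake"   => array("q1", "q2", "q3"),
    "Followup" => array("q1"),
);

$matches = array();
foreach ($surveys as $title => $questionUids) {
    // Mirrors get("survey", array("questions" => array("q1", "q2"))).
    if (matchesAll($questionUids, array("q1", "q2"))) {
        $matches[] = $title;
    }
}

print_r($matches); // only "Intake" contains both q1 and q2
```

		A survey that contains only one of the requested questions is therefore not returned.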

\section{Specification}\label{spec}
	\subsection{Models}
		The framework relies heavily on the object-oriented programming paradigm, and only allows you to create, edit
		and store data through instances of predefined classes. All these classes inherit from a global
		ResearchToolObject class. A UML class diagram of the model classes can be found in Appendix \ref{models}.
		The most important thing to note is that references to other objects are not evaluated immediately. Rather,
		classes with references have to override the evaluate() function, which in turn tries to query the
		database and resolve those UIDs to model objects. This gives us two advantages: initial queries do not scale
		exponentially because of the `lazy' evaluation, and the boolean return value notifies us when there is an
		incorrect reference. By evaluating an object before saving it, we ensure that no invalid references end up in
		the database.
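
		The lazy evaluation described above can be sketched as follows. This is an illustration under assumptions,
		not the framework's code: the classes \texttt{SketchUser} and \texttt{SketchSurvey} are hypothetical, a
		plain array stands in for the RDF store, and the real evaluate() queries the database instead of receiving
		the store as a parameter.

```php
<?php
// Hypothetical in-memory stand-in for the RDF store: UID => object.
$store = array();

class SketchUser
{
    public $uid;
    public $name;

    public function __construct($uid, $name)
    {
        $this->uid = $uid;
        $this->name = $name;
    }
}

class SketchSurvey
{
    public $uid;
    public $creator; // holds a UID string until evaluate() succeeds

    public function __construct($uid, $creator)
    {
        $this->uid = $uid;
        $this->creator = $creator;
    }

    // Resolve the creator UID to an object.
    // Returns false when the reference does not point to a stored object.
    public function evaluate(array $store)
    {
        if (is_object($this->creator)) {
            return true; // already resolved earlier
        }
        if (!isset($store[$this->creator])) {
            return false; // dangling reference
        }
        $this->creator = $store[$this->creator];
        return true;
    }
}

$store["u1"] = new SketchUser("u1", "Jos");

$valid = new SketchSurvey("s1", "u1");
$broken = new SketchSurvey("s2", "missing-uid");

$validOk = $valid->evaluate($store);   // reference resolved on demand
$brokenOk = $broken->evaluate($store); // reference cannot be resolved

var_dump($validOk, $brokenOk);
echo $valid->creator->name, "\n";
```

		Until evaluate() is called, the survey only carries the UID string, so retrieving a survey does not
		automatically trigger lookups for everything it refers to.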
	\subsection{Connectors}
		Although the front-end developer only uses one general DatabaseInterface, the different files for the different
		datatypes are accessed by separate connectors. A connector has to implement the IConnector interface, which
		mainly enforces the get() and set() methods to ensure a connector can perform as expected. The rest of the
		methods are implemented in the base class Connector, which should be extended. The get() and set() methods
		differ for every connector. A get() method builds the query string based on the given arguments,
		then retrieves the data and performs other necessary queries (like retrieving a set of fields of unspecified
		length) to ensure the PHP data model is complete. A set() method has to store the data accordingly.
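
		The division of work between the interface, the base class and a concrete connector might look roughly like
		the sketch below. All names carry a \texttt{Sketch} prefix because the method signatures are assumptions,
		plain arrays replace the model objects, and an in-memory array replaces the query against the .rdf file.

```php
<?php
// Assumed shape of the interface; the real IConnector signatures may differ.
interface SketchIConnector
{
    public function get($arguments);
    public function set($rToolObject);
}

// Shared helpers live in the base class, as with the Connector base class.
abstract class SketchConnector implements SketchIConnector
{
    // True if every given argument equals the corresponding field of the entry.
    protected function matches(array $entry, array $arguments)
    {
        foreach ($arguments as $field => $value) {
            if (!isset($entry[$field]) || $entry[$field] !== $value) {
                return false;
            }
        }
        return true;
    }
}

// One connector per datatype; this one keeps its entries in memory.
class SketchQuestionConnector extends SketchConnector
{
    private $questions = array(); // code => associative array of fields

    public function get($arguments)
    {
        $results = array();
        foreach ($this->questions as $question) {
            if ($this->matches($question, $arguments)) {
                $results[] = $question;
            }
        }
        return $results;
    }

    public function set($rToolObject)
    {
        // Storing under the same key overwrites the earlier entry.
        $this->questions[$rToolObject["code"]] = $rToolObject;
    }
}

$connector = new SketchQuestionConnector();
$connector->set(array("code" => "q1", "title" => "Age"));
$connector->set(array("code" => "q2", "title" => "Gender"));

$found = $connector->get(array("title" => "Age"));
echo $found[0]["code"], "\n";
```

		Calling get() with an empty argument array returns every stored entry, which matches the behaviour of
		leaving out the second argument of DatabaseInterface's get().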

	\subsection{PHPDoc}
		PHPDoc for the database classes can be found at
		\href{http://svn.tbm.tudelft.nl/TBM-CPS/RESEARCHTOOL/Doc/Backend/PHPDoc/index.html}{SVN}.

\section{Future Work}\label{futurework}
	\subsection{Adhere to Open-Closed Principle}
		At the moment, the \textit{DatabaseInterface.php} class violates the Open-Closed principle. For every new connector,
		four lines need to be added to this class. This is undesirable when new RDF datatypes are added. The same holds for
		the method createArguments in this class: extra cases have to be added if more types of arguments become possible.
		Adhering to the Open-Closed principle will allow for easier extension and maintenance of the database-related classes.
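
		One possible direction, shown here only as a sketch, is a registry in which each connector is registered
		under its type name, so that supporting a new type no longer requires editing the dispatching class. The
		class \texttt{SketchConnectorRegistry} and the placeholder connectors are hypothetical and do not exist in
		the current code base.

```php
<?php
// Hypothetical registry: maps a type name to a factory that builds its connector.
class SketchConnectorRegistry
{
    private $factories = array();

    // Adding a new datatype means one register() call; this class stays unchanged.
    public function register($type, $factory)
    {
        $this->factories[strtolower($type)] = $factory;
    }

    public function connectorFor($type)
    {
        $key = strtolower($type);
        if (!isset($this->factories[$key])) {
            throw new InvalidArgumentException("No connector registered for type: " . $type);
        }
        return call_user_func($this->factories[$key]);
    }
}

// Placeholder connectors, standing in for the real per-type connectors.
class SketchSurveyConnector
{
    public $label = "survey connector";
}

class SketchUserConnector
{
    public $label = "user connector";
}

$registry = new SketchConnectorRegistry();
$registry->register("survey", function () {
    return new SketchSurveyConnector();
});
$registry->register("user", function () {
    return new SketchUserConnector();
});

// Type names are matched case-insensitively in this sketch.
$connector = $registry->connectorFor("Survey");
echo $connector->label, "\n";
```

		A similar lookup table could replace the growing list of cases in createArguments, although that has not
		been tried.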

	\subsection{Deal with invalid or missing values}
		This comes in two parts. Firstly, the database currently assumes that, given an entry, all the fields for its type are present and
		filled in. This is undesirable since it can cause issues with backwards compatibility of certain .rdf files. Secondly, it
		assumes that fields containing a UID of a different entry actually point to a valid entry. For instance, a Survey query will
		also try to get a User object from the database as its creator value, assuming it exists. It will break if this User object
		does not exist.
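
		For the first part, one option is to fill in defaults for missing fields before a model object is built. The
		sketch below shows the idea on a plain array; the function \texttt{normalizeSurveyRow()} and the chosen
		default values are assumptions made for illustration.

```php
<?php
// Hypothetical helper: completes a raw Survey entry with default values so that
// older .rdf entries lacking some fields can still be turned into an object.
function normalizeSurveyRow(array $row)
{
    $defaults = array(
        "title"       => "",
        "description" => "",
        "creator"     => null,    // no creator known
        "questions"   => array(), // zero questions
    );
    // Keep only known fields from $row; values from $row win over the defaults.
    return array_merge($defaults, array_intersect_key($row, $defaults));
}

// An old entry that only has a title and an unknown extra field.
$legacy = normalizeSurveyRow(array("title" => "Legacy survey", "obsolete" => "x"));

print_r($legacy);
```

		The second part, dangling UIDs, would still need a check at the point where the reference is resolved.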

	\subsection{Optimize queries}
		The current queries can become a performance bottleneck when the data sets get really large. There are a
		couple of reasons for this. For one, duplicate queries are executed: if a Session object is retrieved, it also retrieves
		all the corresponding AnswerSets, which in turn retrieve the corresponding Questions. However, the Session also
		retrieves the Surveys related to it, which afterwards retrieve the same Questions again. There has been no benchmarking
		of performance, so I do not know at what size of the .rdf files this will become an issue.
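
		One way to avoid the duplicate lookups is to remember objects that were already retrieved during the current
		request. The following sketch assumes a hypothetical \texttt{SketchQuestionCache} class; the private method
		\texttt{queryDatabase()} merely simulates a query and counts how often it is called.

```php
<?php
// Hypothetical per-request cache for Question lookups.
class SketchQuestionCache
{
    private $loaded = array(); // UID => already retrieved question
    public $lookups = 0;       // number of simulated database queries

    public function getQuestion($uid)
    {
        if (!array_key_exists($uid, $this->loaded)) {
            $this->lookups++;
            $this->loaded[$uid] = $this->queryDatabase($uid);
        }
        return $this->loaded[$uid];
    }

    // Stand-in for a real query against the .rdf file.
    private function queryDatabase($uid)
    {
        return array("uid" => $uid, "title" => "Question " . $uid);
    }
}

$cache = new SketchQuestionCache();

// The AnswerSets and the Surveys of one Session both refer to q1 and q2.
foreach (array("q1", "q2", "q1", "q2") as $uid) {
    $cache->getQuestion($uid);
}

echo $cache->lookups, "\n"; // prints 2: each question is queried only once
```

		Whether such a cache is worth its complexity depends on measurements that have not been done yet.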

		I have a gut feeling that the SPARQL queries themselves can be optimized as well. This concerns entries with zero or more
		elements of a specific type, like Survey. Currently, a second query fetches the IDs of all the questions, but
		there must be a way to fold this into the first query without getting the exponential growth of the RDF `different
		tree options'.


\appendix
\pagebreak
\section{Class diagram of the models}\label{models}
	\begin{figure}
		\includepdf[scale=1.22]{DBModels.pdf}
	\end{figure}

\end{document}