Logische Grundlagen der Datenbanktheorie ws13
Dozent  Tadeusz Litak 
Termine und Ort  WS2013/14 Fri 10:3014:00, Seminar Room of Informatik 8 (11 floor) (note the change!) 
Umfang  4 SWS – 7,5 ECTS 
Angaben  Vorlesung mit Übung, geeignet als Schlüsselqualifikation, für Gasthörer zugelassen, Unterrichtssprache Deutsch oder Englisch 
Studienfächer / Studienrichtungen  WPF INFMA ab 7 (ECTSCredits: 7,5), WPF INFBAVTHI ab 4 (ECTSCredits: 7,5), WPF INFBAVTHI ab 4 (ECTSCredits: 7,5), WF MBA ab 4 (ECTSCredits: 7,5), WF MBA ab 7 (ECTSCredits: 7,5) 
Slogans
Database queries are formulas, formulas are database queries.
Database instances are finite models, finite models are database instances.
News, lectures, homework

Lecture 3: (corrected) lecture notes, homework (reposted without confusing “Exercise 1” in the header for your archives).

Lecture 4: lecture abstract,
homework(see below). 
Lecture 5: lecture abstract, combined homework for Lectures 4 and 5.

Lecture 6: lecture abstract, a small Bonusaufgabe.

Lecture 11: see corrected final version of the abstract
Highlights

We will begin with a brief introduction explaining our motivating slogans (above). We will see the good old notions of model and satisfaction of formula from GLoLoP in an entirely new light. In particular, we will realize that formalisms which laid theoretical foundations for SQL and relational databases—the tuple calculus (in its named and unnamed version) and relational algebra—are variants of firstorder logic on finite models. We will also quickly look at various extensions which wormed its way into SQL, such as counting, and briefly wonder what such things would correspond to. We’ll also see see how XML enters the picture: various formalisms for navigating/querying XML documents (think fragments of XPath, for example) are just variants of logic of finite trees. With XML and graph databases (think RDF, ontologies etc.), we will also see modal logic entering the picture. We will need thus to characterize the precise relationship between modal and predicate logic: on arbitrary models, on finite models and on finite trees. This will be our first encounter with a few celebrated results, such as the Van BenthemRosen Theorem.

Finite models are a foreign country. They do things differently there. Seeing now model theory and modal/predicate logic all over the place, we will need a reminder/overview of how things work on arbitrary (i.e., possibly infinite) models. This again should be roughly familiar from introductory logic classes like GLoLoP, but I don’t want to assume too much knowledge. Thus, we will overview/recall, e.g., how the set of formulas valid on arbitrary models is recursively enumerable or what consequences compactness has. We will also see various standard modeltheoretic characterizations of universal, existential, positive or existential positive formulas (the last we can now see as unions of conjunctive queries). Then we will realize with horror that when attention is restricted to finite models, most of these results and characterizations break down. Not to lose too much spirit: there are a few telling exceptions! In particular, the Van BenthemRosen Theorem does survive on finite models. Unions of conjunctive queries also fare surprisingly well. These are powerful and amazing logical results, established relatively recently.

Furthermore, some of questions and results which arise on finite models could not even be thought of in the context of oldfashioned model theory. This includes the amazing 01 laws. They are a mathematical beauty in their own right. But we will also see with amazement how such theorems provide instant arguments why some queries just cannot be expressed in firstorder logic. We will also learn about more standard tools used to show inexpressibility of queries/properties in a given formalisms, such as various variants of EhrenfeuchtFraisse games.

Finally, if certain natural queries/properties cannot be expressed in basic predicate (or modal) logic, the obvious question to ask is: what if we extend the language? The first thing to think of seems recursion, which arises anyway in database formalisms like Datalog. How do extensions of predicate/modal logic with recursion look like? Which of results and techniques discussed above still apply?

As you see, there are jungles, mountains and vast space out there. While there are basics we just have to cover, pretty soon we’ll have to make choices between various paths worth of exploring and individual preferences/needs of participants can be taken into account here. There is no way we can possibly cover everything in a 4 SWS semester anyway!
Ever wondered what logicians are doing at database conferences?
Ever wondered why core fragments of SQL or XPath look the way they do?
Ever wondered what “finite model theory” is all about?
Ever wondered why some queries are so easy to handle and some seem impossible even to write?
Whether you get your kicks from SQL, XML or just plain good math, come and see databases the way you never saw them before!
Contents
We will discuss the theoretical foundations of databases, starting with the underlying data model. We will focus mostly on various flavours of the relational model (calculus vs. algebra, named vs. unnamed variant etc.) and its reallife SQL incarnation, but other models, in particular semistructured databases and their reallife XML incarnation will also get their share of attention. We will investigate the connection between databases and finite model theory and overview some most interesting results in the field. Logical issues in the design of database query languages, in particular expressive completeness, but also questions of decidability and complexity will be of particular importance. Same applies to the use of logical tools for databasetheory purposes, from EhrenfeuchtFraisse games to Herbrand structures.
Lernziele und Kompetenzen
Wissen  The students will learn basic notions and results of finite model theory, with particular attention to notions and theorems relevant for database theory and essential characteristics of expressivity, decidability and complexity. As prerequisites, we will also recall and overview basics of SQL, relational algebra or XML query/navigation languages like XQuery and XPath 

Verstehen  The students will be able to summarize and prove major results of finite model theory and related areas relevant from the database point of view. Examples include equivalences between relational calculus and relational algebra, named and unnamed perspective, expressive completeness of relational calculus and similar results for XPath and XML query languages. The students will be able to explain the deep difference between finite and infinite model theory, by using 01 laws or by showing the failure of most preservation theorems 
Anwenden  The students will be able to use basic tools of finite model theory (and related areas relevant from a database point of view), such as EhrenfeuchtFraisse games or Herbrand structures. The students will use preservation theorems of Rosen or Rossman to characterize expressive power of logics over finite structures 
Analysieren  The students will be able to determine, e.g., whether a given SQLlike or XPathlike query is expressible over all finite models (or restricted subclasses of structures) in a chosen language 
Evaluieren (Beurteilen)  The best students, having gained a deep understanding of finite model theory, will be able to examine and evaluate choices involved in the design of basic database query languages, with particular attention to the criterion of expressive completeness and contrasting expressive power of a given language with its computational complexity 
Erschaffen  Making the right design choices is an important skill whenever the need arises for a new domainspecific language. You can think of database query languages just as an excellent case study given their omnipresence in everyday life. 
Literature

Serge Abiteboul, Richard B. Hull, Victor Vianu. Foundations of Databases. AddisonWesley, 1995.

Serge Abiteboul, Peter Buneman, Dan Suciu. Data on the Web : From Relations to Semistructured Data and XML. Morgan Kaufmann 1999

HeinzDieter Ebbinghaus, Jörg Flum, Finite Model Theory, Springer 1995

Paris C. Kanellakis. Elements of relational database theory, in Handbook of theoretical computer science (vol. B) Pages 1073 – 1156, MIT Press Cambridge, MA, USA 1990

Phokion Kolaitis. The Expressive Power of Logics on Finite Models, in Finite Model Theory and its Applications, EATCS Series: Texts in Theoretical Computer Science, Springer 2007

Leonid Libkin. Elements of Finite Model Theory, Springer 2012
————Supplementary reading on databases——————

Hector GarciaMolina, Jeffrey D. Ullman, Jennifer Widom. Database Systems: The Complete Book, PrenticeHall 2008.

Jeffrey D. Ullman, Jennifer Widom. First Course in Database Systems, Prentice Hall 2008

Raghu Ramakrishnan and Johannes Gehrke. Database Management Systems, McGrawHill 2007
———–Supplementary reading on logic and classical (unrestricted) model theory———

Herbert B. Enderton. A Mathematical Introduction to Logic (1 ed. 1972). Academic Press Second edition, 2001. ISBN 9780122384523

Wilfrid Hodges. Model theory, Cambridge University Press, Cambridge 1993, 772pp.