Dexter is a browser-based, domain-independent explorer for structured data for the everyday user.
Dexter allows you to browse and expressively combine practically any local and remote data. Dexter is not a traditional app that allows only fixed views on a fixed set of data. Rather, Dexter gives end users full control over the selection of data sources as well as over how they combine and view data.
To create a new empty table, press the + button. Data can be entered manually in the table cells in a spreadsheet-like fashion. In addition, data can be imported from local or remote CSV, XML or JSON files, or created from selections on any web page as described below. Dexter can also connect to remote APIs or databases.
Importing data from local or remote files
To import a CSV, JSON, or XML file, select 'Import or Paste' in the 'Tables' menu. Click 'Choose File' to select a local file, or enter the URL of a remote file. CSV files sometimes contain the property names in the first line. If this is the case and you want the property names to be taken automatically as column names, check 'Schema in File?'. When done, press 'Extract'. The extraction may take a while; the progress is shown in the dialog. When the extraction is complete (progress is 100%), click 'Done' to finish. The extracted data is then added to the table.
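The effect of the 'Schema in File?' option can be illustrated with a minimal Python sketch. This is not Dexter's own code: the function name `read_csv_table` and the placeholder column names (`col0`, `col1`, ...) used when the option is off are assumptions made for illustration only.

```python
import csv
import io


def read_csv_table(text, schema_in_file):
    """Parse CSV text into (column_names, rows).

    schema_in_file mirrors the 'Schema in File?' checkbox: when True, the
    first line supplies the column names. Otherwise every line is data and
    placeholder names are generated (this naming scheme is an assumption).
    """
    records = list(csv.reader(io.StringIO(text)))
    if not records:
        return [], []
    if schema_in_file:
        return records[0], records[1:]
    width = max(len(record) for record in records)
    columns = ["col%d" % i for i in range(width)]
    return columns, records


sample = "name,city\nAda,London\nAlan,Wilmslow\n"
print(read_csv_table(sample, schema_in_file=True))
print(read_csv_table(sample, schema_in_file=False))
```

With the option off, the header line ends up as an ordinary data row, which is why the checkbox matters for files that carry their property names in the first line.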
In many cases, Web pages contain data that users want to analyze. Dexter allows users to extract data from Web pages semi-automatically: copy the fragment of the Web page, paste it into the 'Paste Web Page Fragment' field, click the 'Extract' button and, on completion, the 'Done' button. The extracted data is then added to the table.
Connecting to APIs
To create a connection to an API, select the 'Connect to API' item from the 'Tables' menu and fill in the fields. The fields 'Total Records' and 'Fields' can be left empty, as they are detected automatically in the connection testing phase.
- 'Direct' specifies whether the connection to the API should be established directly or indirectly (i.e. via the Dexter proxy server).
- Set the field 'Access Method' to GET or POST depending on the API. Most APIs support GET, which is also the default value.
- Set 'Datatype' to JSONP if you have selected a direct connection and the API supports direct connections only with JSONP. Otherwise, leave this field at its default value, JSON.
- Enter the URL of the API in the 'URL' field.
- When the API is invoked, it returns JSON data. Often, users want only a specific fragment of the whole output. In the 'JSON Path' field, enter the dot-separated path to the element in the output that you wish to use. If you wish to use the whole output, leave this field empty.
- Some APIs support filtering on properties of the desired data. If this is the case, and once you know which properties the output has after testing the connection, you can enter a comma-separated list of the numbers of the filterable fields (zero-based index, i.e. the first field has number 0, the second field has number 1 and so on) into the field 'Filter Fields'. If you do not know whether the API supports filtering, or do not wish to use it, leave the field empty.
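The 'JSON Path' and 'Filter Fields' settings can be sketched in Python as follows. This is an illustration under stated assumptions, not Dexter's implementation: the function names are hypothetical, and treating a numeric path segment as a list index is an assumption that the text above does not confirm.

```python
def select_fragment(output, json_path):
    """Follow a dot-separated path into parsed JSON output.

    An empty path returns the whole output, matching the case where the
    'JSON Path' field is left empty. Numeric segments are used as list
    indexes (an assumption made for this sketch).
    """
    node = output
    if not json_path:
        return node
    for segment in json_path.split("."):
        if isinstance(node, list):
            node = node[int(segment)]
        else:
            node = node[segment]
    return node


def filterable_fields(field_names, filter_fields):
    """Map a comma-separated list of zero-based indexes to field names."""
    if not filter_fields.strip():
        return []
    return [field_names[int(index)] for index in filter_fields.split(",")]


# Hypothetical API output, used only to demonstrate the two helpers.
api_output = {"response": {"docs": [{"id": 1, "title": "a"},
                                    {"id": 2, "title": "b"}]}}
print(select_fragment(api_output, "response.docs"))
print(filterable_fields(["id", "title", "year"], "0,2"))
```

Here the path `response.docs` selects only the list of records, and the filter list `0,2` names the first and third field.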
Once you have filled in the fields, click the 'Test Connection' button. Testing the connection and computing the number of records and the set of property names may take a while, so be patient. When the connection is successful, click the 'Ok' button to finish. After completion, a rule is automatically added in the area above the selected table. Roughly, this rule connects the content of the remote table to that of the selected table. Rules are described in more detail later in this document.
Connecting to Databases
- 'Host' is the name or the IP address of the database server.
- 'Database' is the name of the database.
- 'User Name' is the name of a database user that has access to the database.
- 'Password' is the password corresponding to the database user name.
- 'Folder Name' is the name of the folder where the thumbnails corresponding to the tables in the database should be created.
- 'Table Name' is the name of a database table for which a rule is automatically created. Leave this field empty if no rule should be generated.
After filling in the fields, click the 'Test Connection' button. When the connection test succeeds, press the 'Ok' button to finish. After completion, a new folder with the name you have specified is created. Select the folder in the list at the top left to see the thumbnails corresponding to the database tables.
Editing Table Schema
Adding/Removing Rows or Columns
Next to the 'Tables' menu there are five buttons for editing the schema of a table. To add a row below a particular row, select any cell in that row by clicking on it, and then press the 'add row below' button (arrow down and horizontal bar). Adding a row above the current row, and adding a column to the left or right of the current column, work analogously. To remove a row or a column, click on the row number or the column header. This selects the whole row or the whole column. Then press the 'delete row or column' button (a cross).
Editing Column Names
Click on the column name you wish to edit. The column name field becomes editable. Type the new column name, then leave the field by clicking outside it.
Editing Table Name and Moving Tables
The name of the selected table is displayed in the middle of the header on the right side, and can be edited by clicking on the field and typing in the new table name. The first part of the table name (the part before the dot) corresponds to the folder. The new name must comply with this convention. Changing the first part automatically moves the table to its new location. A folder with the new name is created if it does not already exist.
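The naming convention can be illustrated with a small Python sketch. The helper names and the dictionary that models folders are hypothetical. Splitting at the first dot, and leaving an emptied source folder in place, are assumptions of this sketch rather than documented Dexter behavior.

```python
def split_table_name(full_name):
    """Split 'folder.table' at the first dot into (folder, table)."""
    folder, dot, table = full_name.partition(".")
    if not dot or not folder or not table:
        raise ValueError("expected a name of the form 'folder.table'")
    return folder, table


def rename_table(folders, old_name, new_name):
    """Rename a table in a dict that maps folder name -> set of table names."""
    old_folder, old_table = split_table_name(old_name)
    new_folder, new_table = split_table_name(new_name)
    folders[old_folder].remove(old_table)
    # The target folder is created if it does not already exist.
    folders.setdefault(new_folder, set()).add(new_table)
    return folders


folders = {"sales": {"orders"}}
rename_table(folders, "sales.orders", "archive.orders2019")
print(folders)
```

Renaming `sales.orders` to `archive.orders2019` both renames the table and moves it into the folder `archive`, which is created on the fly.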
To save a table, select it by clicking its thumbnail and then select the 'Save Table' item in the 'Table' menu.
Computing and Validating tables
Data Computation Rules
In addition to the data entered or imported into a table, a table can also contain computed data. The rules for computing table data can be entered in the text area above the table in Dexlog syntax. The rule editor is equipped with context-sensitive autocompletion, which speeds up the typing of syntactically correct rules. The autocompletion can be invoked by pressing Ctrl+Space at any time while typing inside the rule editor. We refer to Dexlog for details on the rule language syntax.
Data computation rules can be evaluated by pressing the 'Play' button. Depending on the complexity of the rules and the size of the data in the tables referenced from the rules, the evaluation can take some time. During the evaluation, the progress is shown on the thumbnail, and the computed results are added to the table in streaming fashion as soon as they become available. A running evaluation can be stopped by pressing the 'Stop' button (this is the same button as 'Play', which turns into 'Stop' when pressed). The computed data rows have a different background color than the manually entered data rows. Furthermore, the computed rows are not editable by default (see the section on Pinning/Unpinning to learn more about editing computed data).
Constraints and Violation Pinpointing
The rule editor may also contain constraints that specify the data that is not allowed in the table. Constraints are specified as rules in the same language as the computation rules, except that the head of a constraint is always 'illegal'. Constraints are evaluated in the same fashion as rules, i.e. by pressing the 'Play' button. Upon completion, the illegal rows of a table are highlighted in red. We refer to Dexlog for details on the rule language syntax.
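The idea behind constraints can be illustrated in plain Python. This sketch is deliberately not written in Dexlog and makes no claim about Dexlog syntax. It only shows that a constraint describes rows that are not allowed, and that every row matched by at least one constraint is flagged. The function name, the sample rows and the two example constraints are all hypothetical.

```python
def find_illegal_rows(rows, constraints):
    """Return the indexes of rows matched by at least one constraint.

    Each constraint is a predicate describing data that is NOT allowed,
    mirroring a rule whose head is 'illegal'.
    """
    return [index for index, row in enumerate(rows)
            if any(check(row) for check in constraints)]


rows = [
    {"item": "pen", "price": 2},
    {"item": "ink", "price": -1},
    {"item": "", "price": 5},
]
constraints = [
    lambda row: row["price"] < 0,   # negative prices are not allowed
    lambda row: row["item"] == "",  # empty item names are not allowed
]
print(find_illegal_rows(rows, constraints))  # indexes of rows to highlight
```

The returned indexes correspond to the rows that would be highlighted in red.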
Pinning / Unpinning tables
The computed, non-editable data of a table can be saved, so that it does not need to be evaluated again and again, by selecting 'Pin/Unpin' table in the 'Table' menu. On pinning a table, the computed data is stored locally along with the normal editable data, and the computed data becomes editable. Editable computed rows can be edited just like normal editable rows, as described above.
When a table is already pinned, selecting 'Pin/Unpin' table in the 'Table' menu unpins the table, i.e. removes the stored computed data from the storage. However, if some computed data was edited, Dexter asks the user whether the changed rows should be saved.
To file system
To export a table to a file, select 'Export' and the appropriate sub-item in the 'Table' menu. The sub-item 'Data and Rules' exports both the data and the rules of a table, whereas all other formats export only the data of a table.
To sharing server
To share a table (only the data), select 'Share Table' from the 'Table' menu. This action uploads the table data to the Sharing Server. All Dexter users have access to shared tables. Shared tables are listed under the folder 'shared'. The name of a shared table is prefixed with the user's initials. By default the initials are 'xyz'. It is recommended that users change their initials to something unique (see the next section for how to change initials and other settings).
Settings
- 'Chunksize' is the number of table rows fetched at one invocation of a remote API or database. Some APIs have a limit on the number of rows they return in one invocation. Set it to a number that is small enough to not cross the limits of the APIs that you use, and large enough to not cause too many invocations.
- 'Max Search Workers' is the maximum number of concurrent query evaluation threads running at a time. Roughly, the higher the number, the faster the query evaluation, but regular client machines usually do not perform well with a very high number of threads. If you are unsure, try setting the number to 2 and then to 8, compare the performance in both cases with that of 4 workers, and choose accordingly.
- 'Cache Size' is the number of table chunks that are cached during query evaluation so that they do not need to be fetched again. In general, the higher the number the better, but again a regular client machine may have memory limitations.
- 'User Name' is your full name, or whatever you would like to be identified by. This information is currently not used anywhere by Dexter, so you can also leave it empty or at its default value.
- 'User Namespace' is where you enter your initials. Your initials are used as a prefix for the tables you share, to distinguish them from the shared tables of other users.
- 'JSON API Proxy URL' is the location of the proxy server to be used for connecting to APIs indirectly. The default value points to the Dexter server. In advanced usage scenarios you may want to change it to some other location. Note, however, that the new proxy location must point to a functioning script compatible with Dexter.
- 'MySQL Connect URL' is the URL of the script that is invoked to test the connection to a MySQL database and that returns the metadata about the tables in the database. The default value points to a script on the Dexter server. Change the URL to another Dexter-compatible script only in advanced use cases.
- 'MySQL Proxy URL' is the base of the URL used for establishing a connection to a MySQL database. The exact URL of a specific database is returned by the script at MySQL Connect URL. Again, you would want to change this URL only in very advanced use cases.
- 'Show Remote Tables' controls whether remote (non-editable) tables are shown as thumbnails.
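The trade-off behind 'Chunksize' and 'Cache Size' can be made concrete with a short Python sketch. It is an illustration only: the class and function names are hypothetical, `fetch_chunk` stands in for a remote API or database call, and the least-recently-used eviction policy is an assumption, since the text above does not say how Dexter chooses which chunks to drop.

```python
import math
from collections import OrderedDict


def invocations_needed(total_records, chunk_size):
    """Number of remote calls needed to fetch every record once."""
    return math.ceil(total_records / chunk_size)


class ChunkCache:
    """Keep at most cache_size chunks, evicting the least recently used."""

    def __init__(self, cache_size, fetch_chunk):
        self.cache_size = cache_size
        self.fetch_chunk = fetch_chunk  # hypothetical callable: index -> rows
        self.chunks = OrderedDict()
        self.remote_calls = 0

    def get(self, index):
        if index in self.chunks:
            self.chunks.move_to_end(index)  # mark as recently used
            return self.chunks[index]
        self.remote_calls += 1
        rows = self.fetch_chunk(index)
        self.chunks[index] = rows
        if len(self.chunks) > self.cache_size:
            self.chunks.popitem(last=False)  # drop the least recently used
        return rows


print(invocations_needed(1001, 100))  # one extra call for the last record
cache = ChunkCache(2, lambda index: [index * 10, index * 10 + 1])
for index in (0, 1, 0, 2, 0, 1):
    cache.get(index)
print(cache.remote_calls)
```

A smaller chunk size increases the number of invocations, while a larger cache reduces how often a chunk that was already fetched has to be requested again, at the cost of more memory on the client.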