Tutorial: Install the Data Hub Framework
1 - Set Up the Project Directory and Sample Data
- Create a directory called data-hub. This directory will be referred to as “your project root” or simply “root”.
- Download the quick-start-4.3.2.war file and place it in your project root directory.
- Under your project root, create a directory called input.
- Download the sample data .zip file. Expand it, as needed.
- Copy the subdirectories (e.g., campaigns, customers, orders) inside the sample data .zip file into the input directory.
Result
Your project directory structure will be as follows:
data-hub
├─ quick-start-4.3.2.war
└─ input
   ├─ campaigns
   ├─ customers
   ├─ issuehistories
   ├─ issues
   ├─ orders
   ├─ parties
   ├─ products
   │  ├─ games
   │  └─ misc
   ├─ responses
   └─ supportcustomers
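The setup steps in this section can also be scripted. The following is a minimal sketch, not an official installer: it assumes the .war file and the sample data archive have already been downloaded to the current directory, that the archive holds the sample subdirectories at its top level, and that the archive is named sample-data.zip (a hypothetical name; substitute the actual file name of your download).

```shell
#!/bin/sh
# Sketch: create the project layout described in this section.
set -eu

PROJECT_ROOT="data-hub"            # project root named in the tutorial
WAR_FILE="quick-start-4.3.2.war"   # assumed to be in the current directory
SAMPLE_ZIP="sample-data.zip"       # hypothetical name; use your download's real name

# Create the project root and the input directory in one step.
mkdir -p "$PROJECT_ROOT/input"

# Place the QuickStart .war in the project root, if it has been downloaded.
if [ -f "$WAR_FILE" ]; then
  cp "$WAR_FILE" "$PROJECT_ROOT/"
else
  echo "Download $WAR_FILE and place it in $PROJECT_ROOT/" >&2
fi

# Expand the sample data into input/, if the archive is present.
# Assumes the subdirectories sit at the top level of the archive.
if [ -f "$SAMPLE_ZIP" ]; then
  unzip -q -o "$SAMPLE_ZIP" -d "$PROJECT_ROOT/input"
else
  echo "Download the sample data .zip and expand it into $PROJECT_ROOT/input/" >&2
fi
```

If the archive wraps its contents in an extra top-level folder, move the subdirectories up into input after expanding so that the layout matches the tree shown above.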
2 - Start QuickStart
- Open a command-line window, and navigate to your DHF project root directory.
- Run the QuickStart .war.
  - To use the default port number for the internal web server (port 8080):
    java -jar quick-start-4.3.2.war
  - To use a custom port number (e.g., port 9000):
    java -jar quick-start-4.3.2.war --server.port=9000
NOTE: If you are using Windows and a firewall alert appears, click Allow access.
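To confirm that QuickStart is accepting connections before moving on, a quick check along these lines may help. This is a sketch that assumes curl is installed; PORT and the quickstart_url helper are names introduced here for illustration and are not part of QuickStart.

```shell
#!/bin/sh
# Sketch: check whether the QuickStart web server answers on a given port.

# Build the QuickStart URL for a port (defaults to 8080, the default named above).
quickstart_url() {
  echo "http://localhost:${1:-8080}"
}

# PORT is a hypothetical variable for this sketch. Set it to the value you
# passed to --server.port if you started QuickStart on a custom port.
PORT="${PORT:-8080}"
URL="$(quickstart_url "$PORT")"

# curl exits non-zero when nothing is listening, so the else branch reports that.
if curl --silent --output /dev/null --max-time 5 "$URL"; then
  echo "QuickStart is responding at $URL"
else
  echo "No response from $URL yet; QuickStart may still be starting"
fi
```

For example, after starting QuickStart with --server.port=9000, run the check with PORT=9000 and open the same URL in your browser in the next step.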
Result
3 - Install the Data Hub
- Open a web browser, and navigate to http://localhost:8080 (or the custom port you chose when starting QuickStart).
- Browse to your project root directory, then click the button to continue.
- Click the button to initialize your project directory.
- After initializing your Data Hub Framework project, your project directory contains additional files and directories. Click the button to continue.
- Choose the local environment, then click the button to continue.
- Enter your MarkLogic Server credentials, then click the button to continue.
- Click the button to install the data hub into MarkLogic.
- Wait for the installation to complete.
- When installation is complete, click the button to continue.
Result
When installation is complete, the Dashboard page displays the three initial databases and the number of documents in each.
- Staging contains incoming data.
- Final contains harmonized data.
- Jobs contains data about the jobs that are run and tracing data about each harmonized document.
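The same document counts can be cross-checked from the command line. The sketch below queries the MarkLogic Management API and rests on several assumptions that this tutorial does not state: that the Management API listens on port 8002, that the three databases use the names data-hub-STAGING, data-hub-FINAL, and data-hub-JOBS, and that ML_HOST, ML_USER, and ML_PASS (placeholder variables introduced here) hold your own host and credentials. Adjust all of these to match your installation.

```shell
#!/bin/sh
# Sketch: request per-database counts from the MarkLogic Management API.

# Placeholder connection settings; replace with your own values.
ML_HOST="${ML_HOST:-localhost}"
ML_USER="${ML_USER:-admin}"
ML_PASS="${ML_PASS:-admin}"

# Build the counts URL for one database (port 8002 is an assumption).
counts_url() {
  echo "http://${ML_HOST}:8002/manage/v2/databases/$1?view=counts&format=json"
}

# Database names below are assumed defaults; check your project configuration.
for db in data-hub-STAGING data-hub-FINAL data-hub-JOBS; do
  echo "== $db =="
  curl --silent --max-time 5 --anyauth --user "$ML_USER:$ML_PASS" \
    "$(counts_url "$db")" \
    || echo "Could not reach the Management API for $db"
  echo
done
```

Immediately after installation, the counts should match what the Dashboard displays; any difference usually points to a different database name or host than the ones assumed here.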