System architecture & requirements guide
Contents
1 PRODUCT DESCRIPTION
2 PRODUCT COMPONENTS
2.1 CORE COMPONENTS
2.2 RELATED STAR STORAGE COMPONENTS
2.3 COMPLEMENTARY 3RD PARTY PRODUCTS
3 REFERENCE ARCHITECTURE
3.1 BASIC REFERENCE ARCHITECTURE FOR SEAL
3.2 HIGH AVAILABILITY REFERENCE ARCHITECTURE FOR SEAL INSTALLATION
3.3 CUSTOM REFERENCE ARCHITECTURE FOR SEAL INSTALLATION
3.4 CUSTOM ASSESSMENT
4 SYSTEM REQUIREMENTS
4.1 HARDWARE REQUIREMENTS
4.1.1 SEAL core components
4.1.2 Related Star Storage components
4.1.3 Complementary 3rd party products
4.1.4 Content storage
4.1.5 Virtualization
4.2 SOFTWARE REQUIREMENTS
4.2.1 Operating systems
4.2.2 Database systems
4.2.3 Recommended Internet browsers
4.2.4 Java environment requirements
4.3 OTHER REQUIREMENTS
4.3.1 Microsoft SQL Server integration
4.3.2 SMTP server integration
4.3.3 HCP platform integration
4.3.4 Active Directory integration
4.3.5 Microsoft .Net Framework
5 DISASTER RECOVERY
5.1 BACKUP AND RESTORE APACHE SOLR INDEX FILES
5.1.1 Activities on the Primary site
5.1.2 Activities on the Disaster Recovery Site
5.2 SEAL DATABASE REPLICATION
5.3 CONTENT FILES REPLICATION
5.4 FINAL CONSIDERATIONS
6 TECHNICAL ASSISTANCE
Apache SOLR 7.3: Enterprise index platform used for content and metadata indexing and fast
search operations. It is delivered as a standalone application (including
Apache SOLR Jetty container libraries).
Note: When an HA architecture is required, at least 3 Apache ZooKeeper services must also be
installed to create a SolrCloud environment.
Application server: The currently supported server for SEAL core apps is WildFly 13.0.0.Final.
Content Platform:
- File system
- Any storage managed by Tivoli Storage Manager V6.2 Fix Pack 3 Server (TSM 6.2.3)
3 Reference architecture
The deployment architecture for the SEAL platform and its embedded and 3rd party components
is established separately for each environment, based on the environment's purpose (TEST or
PRODUCTION), the number of concurrent users that access the system, and the archive
ingestion rate (number and size of documents per unit of time).
The next sub-chapters present some example system architectures, including only
SEAL core components.
In a basic architecture, recommended for a low ingestion rate and a small number of concurrent
users, all SEAL core applications can be deployed on a single application server (one JVM
instance), while the Apache SOLR application can be deployed on a distinct application server
(second JVM instance) installed on a different server, as described in the above image, or even
on the same server.
The storage can be Hitachi Content Platform or other supported storage: file system, IBM Tivoli
Storage Manager or an EMC Documentum content server.
Figure 2
When the ingestion rate requirements or the number of concurrent users accessing the system
increase, a scaled architecture is recommended, as described in the above image.
This type of architecture also fulfills high availability requirements.
SEAL core applications are installed in two application server clusters, while SOLR is deployed in
a SOLR Cloud distributed architecture, with two replica nodes. The SOLR Cloud requires a
minimum of three ZooKeeper services, which can be installed on three of the servers in the
architecture.
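The three-service minimum follows from ZooKeeper's majority quorum: an ensemble keeps working only while more than half of its members are available. The sketch below shows that arithmetic. The function names are illustrative and are not part of SEAL, SOLR, or ZooKeeper.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of ZooKeeper services that forms a majority."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Number of services that can fail while a majority still remains."""
    return ensemble_size - quorum_size(ensemble_size)


# Print ensemble size, quorum size, and tolerated failures for a few sizes.
for n in (1, 2, 3, 5):
    print(n, quorum_size(n), tolerated_failures(n))
```

With one or two services no failure can be tolerated, so three is the smallest ensemble that survives the loss of a single service.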
So that they can scale independently as the application evolves, SEAL activities
are divided into:
- Synchronous activities - processes that involve the web interface and must have a
reasonable response time to ensure user comfort.
- Asynchronous activities - processes that involve large volumes of data and are designed to
be performed asynchronously (for example: data exports).
These activities are grouped in different J2EE applications to be installed on different servers, as
follows:
Based on the specific implementation requirements, when using the above reference
architecture, the ingestion, index & search and client applications can be scaled independently,
by increasing or decreasing the number of servers or the number of JVMs installed in each cluster.
The storage can be Hitachi Content Platform or other supported storage: file system, IBM Tivoli
Storage Manager or an EMC Documentum content server.
Figure 3
For additional details, please contact Star Storage using the contact details included in chapter 6.
Note 1: Consult a Star Storage technical consultant for requirements and best practices in
environments with more than 200 concurrent users or larger ingestion rates.
Note 2: Disk space does not include the space necessary for the data files. This space will be
estimated based on the volume of documents.
Note 3: Even for small environments, a 64-bit operating system is recommended.
Component | Workstation/Server memory | CPU | Disk space
SeeIt Server | 4 GB | 2 vCPU (or 1x2.0 GHz Dual core) | 100 MB or more; 1 GB recommended (for cache files)
SeeIt Client | N/A (It will be deployed along with SEAL core applications)
SEAL Uploader | 2 GB | 2 GHz single core | 50 MB (Electronic files storage not included)
SEAL Uploader Service | 4 GB | 2 vCPU (or 1x2.0 GHz Dual core) | 50 MB (Electronic files storage not included)
Star Capture 5 Desktop | 2 GB | 2 GHz single core | 500 MB (Electronic files storage not included)
StarSignUI, StarSign Service, SEALSigner | 2 GB/4 GB | 2 GHz single core / 2 vCPU (or 1x2.0 GHz Dual core) | 100 MB (Electronic files storage not included)
SEAL PST Ingestor | 2 GB | 2 GHz single core | 100 MB (Electronic files storage not included)
Web Server (Apache Web, Nginx or similar HTTP server, per node) | 4 GB | 2 vCPU (or 1x2.0 GHz Dual core)
4.1.5 Virtualization
SEAL core components can be installed on either a physical or a virtualized server, for instance
using Hyper-V or VMware ESXi.
CentOS 7
CentOS 6
Microsoft Windows 7
All other SEAL client applications and related Star Storage products (SEALUploader,
SEALUploader Windows Service, StarSign Service, StarSignUI, and SeeIt Server) are compatible
with Microsoft Windows operating systems, at minimum MS Windows XP SP3.
PostgreSQL 10
During SEAL installation and setup, at least 2 databases will be created (one for the main product
configuration, and then a new database for each electronic archive). For this reason, SEAL uses
XA transactions (please check the installation guide for further details on how to activate XA
transactions).
Be sure to:
1. Have connectivity between SEAL server(s) and MS SQL Server (on the designated port, usually
1433)
2. Enable TCP/IP Network protocol for MS SQL Server
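The connectivity requirement can be verified from a SEAL server before installation. The sketch below is one possible check, not a SEAL tool: it only tries to open a TCP connection and reports whether that succeeded. The function name and the default timeout are assumptions made for illustration.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and connects;
        # the socket is closed again when the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False
```

For example, `can_connect("sqlserver.example.local", 1433)`, where the host name is a placeholder, returns `True` only if the designated SQL Server port is reachable. A successful connection shows that the port is open; it does not validate credentials or database settings.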
Be sure to:
1. Have connectivity between SEAL server(s) and SMTP server (on port 25)
2. Configure SMTP server to allow emails to be sent from SEAL server(s)
Be sure to:
1. Have connectivity between SEAL servers and HCP platform, on HTTP & HTTPS default ports
(80 and 443).
2. For management and troubleshooting purposes, be sure you also have connectivity on the
console port (the default is 8000)
Be sure to:
1. Have connectivity between SEAL server(s) and the Active Directory server (on port 389)
2. Create the security group and place all the users that will use the SEAL application within it.
3. Ensure that all SEAL users have a configured email address (otherwise their accounts cannot be
used with the SEAL application).
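To find accounts that would fail the email requirement, an administrator could search the directory for group members without an email address. The sketch below only builds an LDAP search filter as a string. It assumes the address is stored in the standard Active Directory `mail` attribute, and the group DN shown is hypothetical.

```python
def escape_filter_value(value: str) -> str:
    """Escape characters that are special in LDAP search filters (RFC 4515)."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\0": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def users_without_email_filter(group_dn: str) -> str:
    """Build a filter for user accounts in group_dn that have no mail value."""
    return (
        "(&(objectCategory=person)(objectClass=user)"
        f"(memberOf={escape_filter_value(group_dn)})"
        "(!(mail=*)))"
    )


# Hypothetical group DN, for illustration only.
print(users_without_email_filter("CN=SEAL Users,OU=Groups,DC=example,DC=local"))
```

The resulting string can be passed to any LDAP client or query tool. Note that `memberOf` matches direct group membership only, so users who belong to the group through a nested group are not returned by this filter.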
SeeIt Server
The order of operations in the disaster recovery plan schedule is very important and should be
the one mentioned above in order to avoid data inconsistencies:
- The database timestamp must always be older than the content timestamp (to avoid records
without content in case of a switch to the DR site);
- The database timestamp must always be newer than the index timestamp (to avoid search
results that display records that are not found in the database).
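The two ordering rules can be checked automatically before a switch to the DR site. The sketch below is a hypothetical helper, not part of SEAL: it takes the timestamps of the index, database, and content copies available on the DR site and lists any rule that is violated. Treating equal timestamps as acceptable is an assumption.

```python
from datetime import datetime


def dr_snapshot_problems(index_ts: datetime, database_ts: datetime,
                         content_ts: datetime) -> list[str]:
    """Return a description of each ordering rule that the DR copies violate."""
    problems = []
    if database_ts > content_ts:
        # Database newer than content: some records would have no content.
        problems.append("database is newer than content")
    if index_ts > database_ts:
        # Index newer than database: search could list records missing from the database.
        problems.append("index is newer than database")
    return problems


# Example with a valid order: index, then database, then content.
print(dr_snapshot_problems(
    index_ts=datetime(2020, 1, 1, 1, 0),
    database_ts=datetime(2020, 1, 1, 2, 0),
    content_ts=datetime(2020, 1, 1, 3, 0),
))  # []
```

An empty list means the copies respect the required order, index first, then database, then content.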
The restore script must be scheduled in the disaster recovery plan in such a way that index files
are always older than the database version. Missing index values will be recomputed by the SEAL
instance when the DR site becomes the active site, by running the Rebuild SOLR index process.
Note: Starting with Apache Solr 7.1, as an alternative to the backup and restore procedure
described above, Cross Data Center Replication can be configured, as described in Apache Solr
SEAL database replication is ensured using a database-specific mechanism. The mechanism
used for metadata replication depends on the RDBMS in place.
For example:
1. Microsoft SQL Server: Database mirroring (Standard Edition) or Always On availability
groups (Enterprise Edition)
2. Oracle Database Server: Oracle Real Application Clusters + Oracle Streams (Standard
Edition) or Oracle Real Application Clusters + Oracle Active Data Guard (Enterprise
Edition)
Replication, an HCP service, is the process of keeping selected tenants and namespaces in two
HCP systems in sync with each other. This entails copying object creations, object deletions,
metadata changes, and other information between the two systems. Typically, the two systems
are in separate geographic locations and are connected by a high-speed wide area network.
Replication occurs between two separate HCP systems, each of which is complete in its own
right. Because replication is a software function, the two systems can have entirely different
hardware configurations, including differing amounts of storage. The two systems in a replicated
pair are connected through the front-end network infrastructure, as shown in the figure below.
Replication traffic from each system must be routable to the network selected for replication on
the other system.