Shinken's architecture aims to make load balancing and high availability easier. The administrator manages a single configuration; the system automatically "cuts" it into parts and dispatches them to worker nodes. It takes its name from this functionality: a Shinken is a Japanese sword.
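The "cutting" step can be illustrated with a small sketch. It is not Shinken's actual algorithm: the function name split_config, the input format, and the rule that hosts linked by a parent relation stay in the same part are all assumptions made for illustration.

```python
from collections import defaultdict


def split_config(parents, n_parts):
    """Split hosts into n_parts lists, keeping related hosts together.

    parents maps a host name to the list of its parent host names.
    Illustrative only; this is not Shinken's real dispatching code.
    """
    names = set(parents) | {p for ps in parents.values() for p in ps}
    root = {h: h for h in names}  # union-find forest over host names

    def find(h):
        while root[h] != h:
            root[h] = root[root[h]]  # path halving
            h = root[h]
        return h

    # Merge every host with its parents so that they share one pack.
    for host, host_parents in parents.items():
        for p in host_parents:
            root[find(host)] = find(p)

    packs = defaultdict(list)
    for h in sorted(names):
        packs[find(h)].append(h)

    # Greedy balancing: place the largest pack first, always into the
    # part that currently holds the fewest hosts.
    parts = [[] for _ in range(n_parts)]
    for pack in sorted(packs.values(), key=len, reverse=True):
        min(parts, key=len).extend(sorted(pack))
    return parts


# Hypothetical configuration: two small trees and one standalone host.
demo_parents = {
    "router": [], "web1": ["router"], "web2": ["router"],
    "db1": [], "backup": ["db1"], "mail": [],
}
demo_parts = split_config(demo_parents, 2)
```

Keeping related hosts in one part means a single worker could evaluate their dependencies without consulting another node. That is a design choice of this sketch, not a documented property of Shinken.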
Shinken was written by Jean Gabès as a proof of concept for a new Nagios architecture. Believing the new implementation was faster and more flexible than the old C code, he proposed it as the new development branch of Nagios 4.[3] This proposal was turned down by the Nagios authors, so Shinken became an independent network monitoring software application compatible with Nagios.[4]
Using agents that run scripts remotely via the Nagios Remote Plugin Executor (an embedded pure-Python implementation is included with Shinken)
Using agent-less methods such as SNMP, WMI, scripted SSH or HTTP(SSL)
Send check results directly from programs using Apache Thrift (Java, Python, Ruby)
Monitoring of systems that can send collected data over a network to specifically written plugins (e.g. VMware ESX 3/4/5, collectd)
Remote monitoring supported through SSH or SSL-encrypted tunnels
Simple plugin design that allows users to easily develop their own service checks depending on needs, by using the tools of choice (shell scripts, C++, Perl, Ruby, Python, PHP, C#, etc.)
Ability to calculate KPIs from state and performance data in the Shinken core to create new services and performance data
System external interfaces
Livestatus compatible API that exposes state, configuration and performance information
Exports data to graphing modules (PNP4Nagios, Graphite, and others available)
Support for native messaging API of Android
Export event data to logging systems using syslog and RabbitMQ
Modules can be attached to any Shinken process to extend its capabilities in very efficient ways
Performance
Parallelized service and host checks available
Ability to distribute poller processes on multiple servers
Support for easily implementing redundant and load-balanced monitoring hosts
Support for multiple redundant external interfaces
Ability to route checks to dedicated pollers (processes specialized in executing plugins)
Correlation and business intelligence
Parent child relations
Ability to define network host hierarchy using "parent" hosts, allowing detection of and distinction between hosts that are down and those that are unreachable
1 to 1, 1 to N
Free form dependency trees between any service and host
1 to 1, 1 to N
Support for integrated business rules
Calculated hosts or services representing the state of a business service
Support assigning a business impact to each service, host or business process
Ability to show only root problems
Automatically changes child states to unknown when parent is unavailable
Other features
Contact notifications when service or host problems occur and get resolved (via e-mail, pager, SMS, or any user-defined method through plugin system)
Ability to define event handlers to be run during service or host events for proactive problem resolution
Ability to redefine the severity of an alert based on regular expression rules
Support for UTF-8 object names
Support for monitoring multiple customers with one administration point
Support for recurring downtimes through the maintenance_period attribute
Advanced template system with inheritance and overloading
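The distinction between hosts that are down and hosts that are merely unreachable, and the "root problems" view mentioned above, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the input dictionaries and the rule "a failing host is unreachable only if none of its parents is up" are hypothetical simplifications, not code taken from Shinken.

```python
def classify(check_ok, parents):
    """Return 'UP', 'DOWN' or 'UNREACHABLE' for every host.

    check_ok maps host name -> bool (did the host check succeed?).
    parents maps host name -> list of parent host names (acyclic).
    """
    states = {}

    def state_of(host):
        if host in states:
            return states[host]
        if check_ok[host]:
            states[host] = "UP"
        else:
            host_parents = parents.get(host, [])
            # A failing host counts as DOWN when it has no parents or at
            # least one parent is still UP; otherwise the failure is
            # explained by its parents and it is only UNREACHABLE.
            if not host_parents or any(state_of(p) == "UP" for p in host_parents):
                states[host] = "DOWN"
            else:
                states[host] = "UNREACHABLE"
        return states[host]

    for host in check_ok:
        state_of(host)
    return states


def root_problems(states):
    """Keep only the hosts that are themselves DOWN."""
    return sorted(h for h, s in states.items() if s == "DOWN")


# Hypothetical network: router <- switch <- web1, plus an unrelated db1.
demo_checks = {"router": False, "switch": False, "web1": False, "db1": True}
demo_parents = {"switch": ["router"], "web1": ["switch"]}
demo_states = classify(demo_checks, demo_parents)
demo_roots = root_problems(demo_states)
```

In this example only the router is reported as a root problem. The switch and web1 fail their checks too, but they are classified as unreachable because no parent is up.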
A Shinken installation consists of several processes, each optimized for a specific task.
Arbiter
Loads the configuration files and dispatches the host and service objects to the scheduler(s)
Watchdog for all other processes and responsible for initiating failovers if an error is detected
Can route check result events from a Receiver to its associated Scheduler
Arbiter modules
There is a variety of modules to manipulate configuration data
Scheduler
Plans the next run of host and service checks
Dispatches checks to the poller(s)
Calculates state and dependencies
Applies KPI triggers
Raises Notifications and dispatches them to the reactionner(s)
Updates the retention file (or other retention backends)
Sends broks (internal events of any kind) to the broker(s)
Poller
Gets checks from the scheduler, executes plugins or integrated poller modules and sends the results to the scheduler
Poller modules
NRPE - Executes active data acquisition for Nagios Remote Plugin Executor agents
SNMP - Executes active data acquisition for SNMP enabled agents (In beta stage using PySNMP)
CommandPipe - Receives passive status and performance data from the check_mk script; does not process commands
Reactionner
Gets notifications and event handlers from the scheduler, executes plugins/scripts and sends the results to the scheduler
Broker
Has multiple modules (usually running in their own processes)
Gets broks from the scheduler and forwards them to the broker modules
Modules decide if they handle a brok depending on a brok's type (log, initial service/host status, check result, begin/end downtime, ...)
Modules process the broks in many different ways. Some of the modules are:
webui - updates in-memory objects and provides a webserver for the native Shinken GUI
livestatus - updates in-memory objects which can be queried using an API by GUIs like Thruk or Check_MK Multisite
graphite - exports data to a Graphite database
ndodb - updates an ndo database (MySQL or Oracle)
simple_log - centralizes the logs of all the Shinken processes
status_dat - writes to a status.dat file which can be read by the classic cgi-based GUI
Receiver (optional)
Receives data passively from local or remote protocols
Buffers passively received data before forwarding it to the appropriate Scheduler (or to the Arbiter for global commands)
Allows a "farm" of Receivers to be set up to handle a high rate of incoming events
Modules for receivers
NSCA - NSCA protocol receiver
Collectd - Receives performance data from collectd via the network
CommandPipe - Receives commands, status updates and performance data
TSCA - Apache Thrift interface to send check results using a high rate buffered TCP connection directly from programs
Web Service - A web service that accepts HTTP POSTs of check results (beta)
There can be multiple instances for each type of process, either on a single host or spread over many hosts. Adding more processes automatically distributes the load.
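The way the Broker hands broks to its modules, with each module deciding by brok type whether to handle it, can be sketched as below. The class and method names (Module, wants, manage, forward), the dictionary layout of a brok and the type strings are hypothetical and chosen for illustration; they are not Shinken's real module API.

```python
class Module:
    """Hypothetical broker module that handles only some brok types."""

    def __init__(self, name, wanted_types):
        self.name = name
        self.wanted_types = set(wanted_types)
        self.handled = []  # broks this module accepted, in arrival order

    def wants(self, brok):
        # The module itself decides, based only on the brok's type.
        return brok["type"] in self.wanted_types

    def manage(self, brok):
        self.handled.append(brok)


def forward(broks, modules):
    """Offer every brok to every module; a module may ignore it."""
    for brok in broks:
        for module in modules:
            if module.wants(brok):
                module.manage(brok)


# Two illustrative modules named after entries in the list above.
log_module = Module("simple_log", ["log"])
graph_module = Module("graphite", ["check_result"])

demo_broks = [
    {"type": "log", "data": "scheduler started"},
    {"type": "check_result", "data": {"host": "web1", "load": 0.4}},
    {"type": "downtime_start", "data": {"host": "db1"}},
]
forward(demo_broks, [log_module, graph_module])
```

Here the downtime brok is dropped by both modules because neither declared an interest in that type. Adding a third module that wants it requires no change to forward.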
The Shinken WebUI is the built-in web interface. It provides near-real-time status information, configuration, interaction, a dashboard to visualize trending data from Graphite databases, and visualization of dependency tree graphs.
The Shinken skonfUI is an independent web front-end used to manage the discovery process and configuration tasks.
The shinken-admin CLI script is used to manage process-level aspects of the system at runtime, such as changing logging levels and getting health reports.
The install.sh CLI script is the main management script to install, remove or update Shinken and its associated software.
Development
Shinken has an open and test-driven development approach, with contributors to the project providing new features, code refactoring, code quality and bug fixing.[5]
The source code is hosted on GitHub.[6] An integration server runs tests at each commit and in-depth tests at regular intervals.
Gabès, Jean (2009-12-01). "Shinken : a new implementation proposal". GitHub. Retrieved 2014-03-04. I would like to have your feed back about a (unfinished) reimplementation of Nagios named "Shinken" I wrote in Python that is faster and more modular than the current Nagios implementation in C
Gabès, Jean (2010-06-01). "Shinken : a mix with Nagios is not possible". Shinken team. Archived from the original on 2014-01-23. Retrieved 2010-06-01. We never got an answer for the initial Shinken proposal because we are seen as a renegade project. In fact, now we can say that we are a fork.