Pure Storage advanced Monitoring with Splunk: Indexing is Power - Part 2 - FlashBlade with Splunk

[ NOTE: machine translation with the help of DeepL translator without additional proofreading and spell checking ]


The previous blog post covered general information about the blog series "Pure Storage advanced Monitoring with Splunk" and the basic installation of Splunk Enterprise.


Short digression


... Pure Storage can also be your storage/repository for your Splunk instances. Splunk needs to process/index data fast and of course needs a corresponding foundation for that. - Marcel Düssil in blog post "Pure Storage advanced Monitoring ... - Part 1" as of 07/16/2019

I would like to briefly revisit this statement and back it up technically.


Pure Storage FlashArray and FlashBlade are high-performance storage systems for the most demanding requirements. Pure Storage offers solutions for SMEs and scales, depending on customer requirements (performance or storage capacity), up to the largest enterprise environments. In all my time, I have never encountered a block storage or file storage requirement that could not be met with Pure Storage!


For this reason, Pure Storage is also an optimal basis for Splunk data storage. For enterprise Splunk environments, FlashBlade technology should be used; in smaller environments, a FlashArray can also be used.


The following slides speak for themselves ... - take a look and experience the Pure power.


Source: Pure Storage

Source: Pure Storage

FlashBlade supports multiple protocols (NFS, SMB, S3) for file and object storage. An elementary advantage that Pure Storage highlights in direct comparison to other vendors: all of it within a single namespace! Among other things, this enables global data reduction.

Source: Pure Storage

Source: Pure Storage

Setup FlashBlade for Splunk


This post is about integrating Splunk with a Pure Storage FlashBlade for data collection and monitoring.


As mentioned earlier, I had already downloaded the Pure Storage Splunk apps in the previous blog post in preparation for the rest of this series.


We log in to the Splunk Enterprise server with the admin credentials we created earlier.

After successful login, you land directly on the Splunk Enterprise start page.

The first thing we do is import the Pure Storage Splunk Apps for FlashBlade.

There are two different ones on Splunkbase, and they differ significantly in function.

The "PureStorage FlashBlade App" is for user visualization, which means it provides predefined dashboards. The visualization can be filtered by different areas (file systems, objects, policies, alerts, ...). This is similar to what we know from Pure1.

The "PureStorage FlashBlade Add-On For Splunk", on the other hand, is the application that provides the interface to the FlashBlade. It can execute scripts and collect data through the APIs. In Splunk parlance, the add-on is also often referred to as a TA (Technology Add-On). With the FlashBlade add-on, Splunk connects to the REST API of the FlashBlade to retrieve information.


App-Import


From the menu bar, Apps > Manage Apps > "Install app from file", we can import the previously downloaded apps. The Pure Storage apps have a very small file size of less than 2 MB.




After a successful app import, a restart of the Splunk instance is necessary. All Splunk services are completely restarted, which means that no Splunk services are available to other integrated platforms for a short time. In a production Splunk environment, this dependency should be taken into account.


After the restart, the apps are available and you can directly continue with the configuration.


Optional/further preparations


In the course of preparing this blog and researching Splunk, I reached out to a recommended contact, an absolute Splunk expert (introductions can be arranged), who succinctly provided me with further useful tips on optimizing the Splunk instance.


One important point, according to the expert, concerns the Splunk indexes. With the default settings, there is one main index (default) in which all data collected by Splunk would be stored. This is suboptimal for various reasons, and you should definitely use dedicated indexes in production instances. For example, you can specify a defined maximum size or separate data storage paths per index. In addition, in the event of an index error, not the complete index of all systems would be affected, but only the respective instance.


In our case, it makes sense to create a separate index for all Pure Storage devices. However, if it is a pure test environment, this step can be omitted.


This index can be created in the menu bar under Settings > Indexes > "New Index". I have named it "purestorage". All other settings remained at their default values.
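For reference, the same index can also be defined directly in indexes.conf instead of through the UI. The following is a minimal sketch; the paths follow Splunk's usual layout, and the size cap is an illustrative assumption, not a recommendation:

```ini
# indexes.conf, e.g. in $SPLUNK_HOME/etc/system/local/
[purestorage]
homePath   = $SPLUNK_DB/purestorage/db
coldPath   = $SPLUNK_DB/purestorage/colddb
thawedPath = $SPLUNK_DB/purestorage/thaweddb
# illustrative example of a per-index size cap (~10 GB)
maxTotalDataSizeMB = 10240
```

A restart of the Splunk instance is required for changes made directly in indexes.conf to take effect.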





The FlashBlade Add-On/TA Configuration


In order for Splunk to actually receive and process data, a connection/interface to the FlashBlade must be configured. For this, we imported the FlashBlade add-on a few minutes earlier.


Via the menu bar, Settings > Data Inputs > PureStorage FlashBlade > "Add New", we can prepare this interface. The final configuration is then done via the Pure TA itself. At this point, we need to define global settings for the Pure Storage FlashBlade systems in advance.

The name to be specified is not decisive for the configuration! It is purely an alias for the stored user/global account and is not used to establish the connection (IP/DNS). Here you can create, for example, a global Pure-Splunk service user that acts as the interface between Pure Flash systems and Splunk.

The actual configuration is done from within the TA. To do this, we switch to the "PureStorage FlashBlade Add-On For Splunk" via the menu bar > Apps.


At this point it is time to generate the API key on the FlashBlade. To do this, we log in via CLI/SSH with the user to be used at the VIP (virtual IP address) of the FlashBlade. Then we can generate the API key with "pureadmin create --api-token" and copy it.



INFO: it is recommended to save the API key for further use! The key cannot be read from the system again after the session ends. Every Purity user can generate their own API token! The API token is valid until it expires, is deleted, or is regenerated.
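To illustrate what the add-on does with this token under the hood: the Purity//FB REST API exchanges the API token for a short-lived session token via a login call. The following is a minimal sketch using only the Python standard library; the endpoint path and header name follow the public Purity//FB REST convention, and the VIP and token values are placeholders:

```python
import urllib.request

def build_login_request(vip: str, api_token: str) -> urllib.request.Request:
    """Build the POST /api/login request that trades a Purity//FB API
    token for a session token (returned in the x-auth-token header)."""
    return urllib.request.Request(
        url=f"https://{vip}/api/login",
        method="POST",
        headers={"api-token": api_token},
    )

# Placeholder VIP and token; sending the request would return a session
# token that authenticates the subsequent REST queries the TA issues.
req = build_login_request("10.0.0.10", "T-00000000-0000-0000-0000-000000000000")
print(req.full_url)  # https://10.0.0.10/api/login
```

The add-on handles this exchange automatically; the sketch only shows why the TA configuration needs nothing more than the VIP and the API token.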


In the add-on, below the menu bar, we find the "Configuration" section. This is where we configure the interface.


As "Account Name" we enter the user with which we previously generated the API key. For the "Server Address" we enter the VIP in URL format with the https protocol type, and simply paste in the API token. If a valid SSL certificate is installed on the FlashBlade, it can be verified via the checkbox (optional).

From this point on, Splunk starts collecting data directly*.

Now we switch to the TA's "Inputs" tab and edit the Splunk FlashBlade input settings.


With the default settings, Splunk would poll the FlashBlade API every 5 minutes (300 seconds) and collect data into the main index, i.e. default. Here we link the previously created "purestorage" index to the system in question.
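The input created in the UI ends up as a stanza in the TA's inputs.conf, roughly along these lines. This is a sketch only: the stanza prefix and parameter names are assumptions based on typical modular-input TAs, and the exact names used by the PureStorage TA may differ:

```ini
# inputs.conf inside the FlashBlade TA (illustrative stanza/parameter names)
[purestorage_flashblade://my-flashblade]
interval = 300          # polling interval in seconds (default: 5 minutes)
index    = purestorage  # dedicated index instead of main/default
```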


The data collection can be checked via the TA tab "Search" with the search command " index="purestorage" ". As you can see, data is being received successfully.

If problems occur at this point, Splunk's internal logs can be included in the search for troubleshooting: " index="purestorage" OR index=_internal source=splunkd.log PureStorage ".

The configuration of Splunk is complete at this point.


* Note


"From this point on, Splunk starts collecting data directly*." - with this asterisk reference, I will add a little Splunk background information at this point.


Before data can be collected, Splunk first creates a data model in the background; this happens automatically. This step can only be followed to a limited extent and can be observed via Splunk Settings > Data Models.




At this point, data model acceleration can also be activated for large data models. A cache is then maintained, which speeds up search queries across large datasets.
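When enabled through the UI, acceleration corresponds to a setting in datamodels.conf along these lines. A sketch only: the data model name is a placeholder, and the summary range is an illustrative value:

```ini
# datamodels.conf (illustrative; "My_DataModel" is a placeholder name)
[My_DataModel]
acceleration = 1
# keep the accelerated summary for the last 7 days
acceleration.earliest_time = -7d
```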


The Visualization/"PureStorage FlashBlade App For Splunk"


Nothing more needs to be configured in the app for now. The indexed FlashBlade data is read, interpreted, and visualized graphically. The information obtained here is the same information that can also be read via Pure1. As mentioned in Part 1 of this blog series, Splunk only unfolds its full added value in consolidated use across multiple systems. At this point, the only real added value I can see myself is the customizable interval for system requests, script execution, or event processing (e.g. opening a ticket, interacting with a system). The interval in Pure1 is fixed at 60 seconds and cannot be customized.


One can of course make adjustments/customizations within the app, but these will not be considered further in this blog article. A sample excerpt of the integrated system is shown below: