SPLK-2002 Practice Test Questions

160 Questions


Which of the following are true statements about Splunk indexer clustering? (Select all that apply.)


A. All peer nodes must run exactly the same Splunk version.


B. The master node must run the same or a later Splunk version than search heads.


C. The peer nodes must run the same or a later Splunk version than the master node.


D. The search head must run the same or a later Splunk version than the peer nodes.





A.
  All peer nodes must run exactly the same Splunk version.

B.
  The master node must run the same or a later Splunk version than search heads.

Explanation:

✅Correct Answers:

A. All peer nodes must run exactly the same Splunk version.
In an indexer cluster, all peer nodes must run the identical Splunk Enterprise version, including the same build number. This strict requirement ensures compatibility and stable communication between peers as they manage data replication and recovery.

B. The master node must run the same or a later Splunk version than search heads.
The cluster master can be the same version or a newer version than any connected search head. This ensures the master can properly communicate with and manage the search heads that are interacting with the cluster.

❌Incorrect Answers:
C. The peer nodes must run the same or a later Splunk version than the master node.
This is false; it states the requirement in reverse. There is no rule that peer nodes must be at or above the master's version. In fact, the master node must run the same or a later version than its peers, which is why the master is upgraded before the peer nodes during a cluster upgrade.

D. The search head must run the same or a later Splunk version than the peer nodes.
This is incorrect. A search head can connect to an indexer cluster where the peer nodes are running a later Splunk version. The search head's version must simply be compatible.
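If you need to confirm what each tier is actually running, a quick check (a minimal sketch, assuming CLI access to every node) is to run the version command on each instance, or to review the Instances view of the Monitoring Console, which lists the version of every connected instance:

    # Run on the master/manager, each peer, and each search head
    splunk version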

Reference:
Splunk Documentation, "Indexer cluster upgrade overview".

Which of the following is a best practice to maximize indexing performance?


A. Use automatic sourcetyping.


B. Use the Splunk default settings.


C. Not use pre-trained source types.


D. Minimize configuration generality





D.
  Minimize configuration generality

Explanation:
When working with Splunk Enterprise, indexing performance is critical because it directly impacts ingestion speed, search responsiveness, and system scalability. One of the most important best practices is to minimize configuration generality. This means avoiding overly broad or generic configurations that force Splunk to perform unnecessary parsing or evaluation during indexing. Instead, configurations should be precise, tailored to the data source, and optimized for the specific fields or sourcetypes being ingested. By reducing generality, Splunk spends less time interpreting ambiguous rules, which results in faster indexing throughput and more efficient resource usage.

Why D is Correct:
Precise configurations reduce overhead during parsing and indexing.
Generic rules (e.g., catch‑all regex or broad sourcetypes) slow down indexing because Splunk must evaluate more conditions.
Optimized sourcetypes and field extractions ensure Splunk only processes what is necessary, maximizing performance.
This aligns with Splunk’s official guidance: indexing performance improves when configurations are streamlined and specific.
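As an illustration, here is a hedged props.conf sketch for an explicitly defined sourcetype; the stanza name and patterns are hypothetical, but settings like these remove the guesswork (line merging, timestamp searching) that generic configurations force Splunk to do at index time:

    [acme:web:access]
    SHOULD_LINEMERGE = false
    LINE_BREAKER = ([\r\n]+)
    TIME_PREFIX = ^\[
    TIME_FORMAT = %d/%b/%Y:%H:%M:%S %z
    MAX_TIMESTAMP_LOOKAHEAD = 30
    TRUNCATE = 10000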

❌ Why the Other Options Are Incorrect:
A. Use automatic sourcetyping → Incorrect. Automatic sourcetyping can lead to misclassification and extra parsing overhead. Best practice is to define sourcetypes explicitly.

B. Use the Splunk default settings → Incorrect. Defaults are designed for general use, not optimized for performance. Tuning is required for high‑volume environments.

C. Not use pre‑trained source types → Incorrect. Pre‑trained sourcetypes are optimized by Splunk engineers and often improve performance. Avoiding them can increase workload.

📚 References:

Splunk Lantern – Performance tuning the indexing tier
Splunk Docs – Optimize indexing and search processes

A Splunk architect has inherited the Splunk deployment at Buttercup Games and end users are complaining that the events are inconsistently formatted for a web sourcetype. Further investigation reveals that not all web logs flow through the same infrastructure: some of the data goes through heavy forwarders and some of the forwarders are managed by another department. Which of the following items might be the cause for this issue?


A. The search head may have different configurations than the indexers.


B. The data inputs are not properly configured across all the forwarders.


C. The indexers may have different configurations than the heavy forwarders.


D. The forwarders managed by the other department are an older version than the rest





C.
  The indexers may have different configurations than the heavy forwarders.

Explanation:

✅Correct Answer:
The indexers and heavy forwarders can both parse data and assign source types. If they have different configuration files (like props.conf), they will process the same raw data differently. This leads to inconsistent field extraction and event formatting when the data is searched.
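As a hedged illustration (sourcetype name and settings are hypothetical), a props.conf mismatch like the following would produce two different event formats for the same web data, because events that pass through the heavy forwarders arrive already parsed and are never re-parsed by the indexers:

    # props.conf on the indexers
    [web:access]
    SHOULD_LINEMERGE = false
    TIME_FORMAT = %d/%b/%Y:%H:%M:%S %z

    # props.conf on the heavy forwarders run by the other department
    [web:access]
    SHOULD_LINEMERGE = true
    # no TIME_FORMAT, so timestamps are guessed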

❌ Incorrect Answers:

A. The search head may have different configurations than the indexers.
Search heads primarily handle search-time operations, not initial parsing. Inconsistent event formatting is typically a parsing issue that occurs at index time, before the data reaches the search head.

B. The data inputs are not properly configured across all the forwarders.
Incorrect input configuration would likely cause data to not be collected at all, or sent to the wrong index. It would not typically cause the formatting of successfully collected events to be inconsistent.

D. The forwarders managed by the other department are an older version than the rest.
While version mismatches can cause problems, they are less likely to be the direct cause of inconsistent event formatting. Different parsing configurations are a much more common and direct cause of this specific symptom.

Reference:
Splunk Documentation, "Where you can configure data parsing".

When adding or rejoining a member to a search head cluster, the following error is displayed: Error pulling configurations from the search head cluster captain; consider performing a destructive configuration resync on this search head cluster member. What corrective action should be taken?


A. Restart the search head.


B. Run the splunk apply shcluster-bundle command from the deployer.


C. Run the clean raft command on all members of the search head cluster.


D. Run the splunk resync shcluster-replicated-config command on this member.





D.
  Run the splunk resync shcluster-replicated-config command on this member.

Explanation:
In Splunk search head clustering, when a member is added or rejoined and fails to pull configurations from the cluster captain, the error message itself suggests a destructive configuration resync. The corrective action is to run the splunk resync shcluster-replicated-config command on the affected member. This command forces the problematic search head to discard its local replicated configuration and resynchronize with the cluster captain. It is specifically designed to resolve configuration inconsistencies that prevent a member from properly joining or rejoining the cluster.
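A minimal sketch of the recovery sequence, run from $SPLUNK_HOME/bin on the affected member (not on the deployer or the captain):

    splunk resync shcluster-replicated-config
    # then confirm the member is back in sync with the cluster
    splunk show shcluster-status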

❌ Why the other options are incorrect:

A. Restart the search head → Incorrect A restart alone does not resolve configuration mismatches. The error requires a resync, not just a reboot.

B. Run the splunk apply shcluster-bundle command from the deployer → Incorrect This command pushes configuration bundles from the deployer to the cluster, but the error is about a member failing to sync with the captain. The deployer is not involved in this corrective step.

C. Run the clean raft command on all members of the search head cluster → Incorrect clean raft is a last‑resort recovery option for cluster state corruption. It is destructive and not appropriate for resolving a single member’s configuration sync issue.

📚 References:
Splunk Docs – Search Head Cluster Administration
Splunk Docs – Troubleshoot Search Head Clustering
Splunk Enterprise Certified Architect Exam Blueprint

When Splunk indexes data in a non-clustered environment, what kind of files does it create by default?


A. Index and .tsidx files.


B. Rawdata and index files.


C. Compressed and .tsidx files.


D. Compressed and meta data files.





A.
  Index and .tsidx files.

Explanation:
When Splunk indexes data, it performs two primary actions: it stores the original raw data, and it creates index files to allow for fast searching. The terminology in the answers can be a bit confusing, so let's clarify what each term refers to.

Why A is Correct: In Splunk's file structure for an index (found in $SPLUNK_HOME/var/lib/splunk/defaultdb/db/), you will find directories called buckets. Each bucket contains two main types of files:
The rawdata file (rawdata): This is a compressed file (using zlib by default) that contains the original, untransformed event data.
The index files (*.tsidx): These are the "time series index" files. They are the key to Splunk's speed. These files contain the sorted lexicon of all the keywords found in the raw data and pointers back to where those events are located in the rawdata file.
While option A uses the general term "index files," in this context, it is understood to be referring to the collection of files that make up the index, with the .tsidx files being the most critical component for search performance. This is the standard and most accurate description of the files Splunk creates.
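For reference, a rough sketch of what a single warm bucket looks like on disk in a default, non-clustered index (directory and file names are illustrative, and the exact set of files varies by version):

    $SPLUNK_HOME/var/lib/splunk/defaultdb/db/db_1700000100_1700000000_3/
        rawdata/journal.gz                     <- compressed raw event data
        1700000100-1700000000-123456789.tsidx  <- time-series index file(s)
        Hosts.data  Sources.data  SourceTypes.data
        bucket_info.csv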

Why the Other Options Are Incorrect:

B. Rawdata and index files.
This is technically descriptive of the components but is not the precise terminology used by Splunk in an exam context. The "index" part is composed of specific file types, primarily the .tsidx files. Option A is more specific and accurate.

C. Compressed and .tsidx files.
This is misleading. While the rawdata file is indeed compressed, "Compressed" is not a file type or a standard term for a category of files Splunk creates. The correct pair is the compressed rawdata file and the .tsidx index files.

D. Compressed and meta data files.
This is incorrect. While metadata exists (in files such as the .bucketManifest file and the *.data files inside each bucket), the primary, purpose-built files for storing and retrieving data are the rawdata (compressed) and .tsidx (index) files. "Metadata files" is not the standard term for the core indexing components.

Reference
This file structure is documented in the Splunk Enterprise documentation under "How the indexer stores indexes" or "Index structure." The documentation explicitly states:
"A bucket contains the rawdata in a compressed file (rawdata) and the index files (*.tsidx) that point to the rawdata."

What is the default log size for Splunk internal logs?


A. 10MB


B. 20 MB


C. 25MB


D. 30MB





C.
  25MB

Explanation:
Splunk maintains its own internal logs, such as splunkd.log, under $SPLUNK_HOME/var/log/splunk. These logs capture diagnostic information about Splunk’s operation, including indexing activity, search processes, licensing, and system health. To prevent uncontrolled growth of log files, Splunk enforces a default log rotation policy. The most important parameter here is the maximum log file size, which by default is 25MB. Once a log file reaches this threshold, Splunk automatically rotates it, renaming the file and starting a new one. This ensures that logs remain manageable while still retaining sufficient diagnostic history for troubleshooting.

The rotation mechanism is controlled by two key parameters:
maxFileSize: default 25MB (written as 25,000,000 bytes in log.cfg); defines the maximum size a log file can reach before rotation occurs.
maxBackupIndex: Default 5, defines how many rotated log files are retained before the oldest is overwritten.
This means Splunk will keep up to five rotated copies of each log file, each capped at 25MB, ensuring administrators have access to recent logs without consuming excessive disk space. This default configuration strikes a balance between operational visibility and resource efficiency.
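These limits live in $SPLUNK_HOME/etc/log.cfg and should be overridden in log-local.cfg rather than edited in place. A hedged sketch of the relevant appender lines for splunkd.log (the appender name and exact values should be verified against your own version's log.cfg):

    appender.A1=RollingFileAppender
    appender.A1.fileName=${SPLUNK_HOME}/var/log/splunk/splunkd.log
    appender.A1.maxFileSize=25000000
    appender.A1.maxBackupIndex=5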

Why the Other Options Are Incorrect

A. 10MB This is too small compared to Splunk’s default. A 10MB cap would cause frequent rotations, potentially making log review cumbersome and fragmenting diagnostic information. Splunk’s default is deliberately set higher to accommodate verbose logging without overwhelming administrators.

B. 20MB Although closer to the correct value, 20MB is not the default. Splunk documentation explicitly states that the default maximum size is 25MB. Choosing 20MB would misrepresent the actual configuration and could lead to incorrect assumptions in exam scenarios.

D. 30MB This exceeds the default. While administrators can manually configure log size to 30MB or higher in log.cfg or log-local.cfg, the out-of-the-box default remains 25MB. Exam questions test knowledge of defaults, not custom configurations.

References
Splunk Docs – About Splunk logs

Summary
The default log size for Splunk internal logs is 25MB. This value ensures logs are large enough to capture meaningful diagnostic data but small enough to prevent uncontrolled growth. Other options (10MB, 20MB, 30MB) are incorrect because they do not reflect Splunk’s documented defaults. Administrators can adjust these values, but exam questions focus on defaults, making 25MB the correct answer.

To optimize the distribution of primary buckets, when does primary rebalancing automatically occur? (Select all that apply.)


A. Rolling restart completes.


B. Master node rejoins the cluster.


C. Captain joins or rejoins cluster.


D. A peer node joins or rejoins the cluster.





A.
  Rolling restart completes.

B.
  Master node rejoins the cluster.

D.
  A peer node joins or rejoins the cluster.

Explanation:
Primary bucket rebalancing is a process in Splunk indexer clustering that ensures primary buckets are evenly distributed across all peer nodes. This prevents uneven search loads and optimizes performance. Splunk automatically triggers primary rebalancing under specific conditions:

✅ Correct Answers

A. Rolling restart completes → Correct After a rolling restart of the cluster, Splunk rebalances primary buckets to ensure that the restarted peers are properly reintegrated and the distribution of primaries is even.

B. Master node rejoins the cluster → Correct The cluster master (or cluster manager in newer terminology) is responsible for coordinating bucket assignments. When it rejoins the cluster, it triggers a rebalancing to reassert control and redistribute primaries as needed.

D. A peer node joins or rejoins the cluster → Correct When a peer (indexer) joins or rejoins, the cluster master rebalances primaries to include the new or returning peer in the distribution. This ensures that primaries are spread evenly across all available peers

Incorrect Answer

C. Captain joins or rejoins cluster → Incorrect The captain is part of the search head cluster, not the indexer cluster. Primary bucket rebalancing is an indexer cluster function, so the captain’s status has no effect on primary bucket distribution.

Operational Note:
Primary rebalancing can also be triggered manually by an administrator using the rebalance primaries command, but the exam focuses on automatic triggers.

Reference:
Splunk Docs – Rebalance indexer cluster primary buckets

The KV store forms its own cluster within a SHC. What is the maximum number of SHC members across which the KV store will form a cluster?


A. 25


B. 50


C. 100


D. Unlimited





B.
  50

Explanation:
In a Splunk Search Head Cluster (SHC), the KV Store forms its own cluster to replicate and synchronize knowledge objects, lookup data, and other KV Store collections across all members. The KV Store is built on MongoDB, and Splunk enforces a hard limit of 50 members for this KV Store cluster. This means that even if you configure more than 50 search heads in the SHC, the KV Store replication group will only include up to 50 members. This limit is tied to MongoDB’s supported scaling boundaries and Splunk’s architectural design, ensuring stability and predictable performance.

The KV Store cluster is critical because it maintains consistency of knowledge objects across the SHC. When a search head writes to the KV Store, the data must be replicated across the cluster. Limiting the cluster size to 50 ensures replication remains efficient and avoids excessive overhead. Beyond this number, synchronization would become unreliable and resource-intensive, which is why Splunk explicitly documents the maximum supported limit.
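To check the state of the KV store cluster on a member, the following CLI command (run from $SPLUNK_HOME/bin on any SHC member) shows that member's KV store status and, in an SHC, the other replica-set members it can see:

    splunk show kvstore-status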

Why the other options are incorrect

A. 25 This is too low. While smaller clusters are common in practice, Splunk officially supports KV Store clustering up to 50 members. Limiting it to 25 would misrepresent the documented maximum and could lead to incorrect assumptions in exam scenarios.

C. 100 Incorrect because Splunk does not support KV Store clusters of this size. Although SHCs can scale to many members for search distribution, the KV Store replication group itself is capped at 50. Choosing 100 would exceed the supported architecture.

D. Unlimited This is entirely wrong. KV Store clustering is not unlimited; it is bound by MongoDB’s replication limits and Splunk’s supported configuration. Unlimited membership would be impractical and unsupported, leading to performance and consistency issues.

References
Splunk Docs – Search head clustering overview

Summary
The KV Store forms its own cluster within a Search Head Cluster, and the maximum number of members it supports is 50. This limit ensures efficient replication and consistency of knowledge objects across the SHC. Options 25, 100, and Unlimited are incorrect because they do not reflect Splunk’s documented defaults and supported architecture. For exam purposes, always remember that 50 is the hard maximum for KV Store clustering.

Which of the following will cause the greatest reduction in disk size requirements for a cluster of N indexers running Splunk Enterprise Security?


A. Setting the cluster search factor to N-1.


B. Increasing the number of buckets per index.


C. Decreasing the data model acceleration range.


D. Setting the cluster replication factor to N-1.





D.
  Setting the cluster replication factor to N-1.

Explanation:
The key to this question is understanding what each factor controls and its relative impact on the total disk footprint across the entire cluster. Let's analyze each option:

Why D is Correct:
The Replication Factor (RF) determines how many total copies of each bucket of data are maintained across the entire indexer cluster. The default and recommended value is 3 (one primary and two replicas). If you have a cluster of N indexers and you set RF = N-1, you are storing nearly a full copy of the entire dataset on every single indexer.

Impact: This has a massive, multiplicative effect on disk usage. For example, in a 4-indexer cluster (N=4), an RF of 3 (N-1) means the raw data is stored 3 times. The total disk space used cluster-wide is approximately 3 times the raw data volume. Reducing the RF to 2 would immediately reduce the total disk footprint by roughly one-third. Reducing it to 1 (not recommended for production) would reduce it by two-thirds. Therefore, changing the replication factor has the most direct and dramatic impact on the overall disk size requirements for the cluster.
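A rough worked example using Splunk's commonly cited sizing approximation (compressed rawdata is roughly 15% of raw volume per copy, index files roughly 35% per searchable copy; actual ratios depend on the data):

    100 GB/day raw, RF=3, SF=2:
        rawdata copies:   100 x 0.15 x 3 = 45 GB/day
        searchable tsidx: 100 x 0.35 x 2 = 70 GB/day
        total ~ 115 GB/day across the cluster

    Same data with RF=2, SF=2:
        rawdata copies:   100 x 0.15 x 2 = 30 GB/day
        total ~ 100 GB/day, so the rawdata portion alone drops by a third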

Why the Other Options Are Incorrect:

A. Setting the cluster search factor to N-1.
Impact: The Search Factor (SF) determines how many searchable copies of the data exist; it cannot exceed the RF. Searchable copies carry index (.tsidx) files in addition to rawdata, while non-searchable copies carry rawdata only, so changing the SF affects only the index-file portion of the replicated copies. The rawdata copies required by the replication factor remain on disk regardless, so any savings are far smaller than those from lowering the replication factor, and raising the SF toward N-1 would actually increase disk usage rather than reduce it.

B. Increasing the number of buckets per index.
Impact: This setting controls how many "hot" buckets an index can have before it starts rolling to warm. This is a performance and management tuning knob. It has a negligible, if any, effect on the total amount of data stored. A higher number might slightly increase overhead, but it will not cause a "greatest reduction" in disk size.

C. Decreasing the data model acceleration range.
Impact: Splunk Enterprise Security relies heavily on Data Model Acceleration. Accelerated data models build additional, highly optimized .tsidx files to speed up correlation searches. Reducing the acceleration range (e.g., from 90 days to 30 days) will reduce the disk space used by these acceleration summaries. However, this only affects the accelerated data, which is a small fraction of the total raw data stored in the indexes. The reduction from this change is significant but typically much smaller than the reduction achieved by lowering the replication factor, which affects all raw data.

Reference
This concept is covered in the Splunk Enterprise documentation on indexer cluster deployment and storage capacity. Total cluster storage scales roughly with the rawdata footprint multiplied by the replication factor, plus the index-file footprint multiplied by the search factor. The replication factor is therefore the primary multiplier for total disk usage across the cluster, and changing it is the most effective lever for controlling overall disk consumption.

Which Splunk server role regulates the functioning of an indexer cluster?


A. Indexer


B. Deployer


C. Master Node


D. Monitoring Console





C.
  Master Node

Explanation:
In Splunk’s indexer clustering architecture, the Master Node (renamed to Cluster Manager in newer versions) is the central authority that regulates and coordinates the functioning of the cluster. Its responsibilities include:

Assigning and managing replication factor (RF) and search factor (SF) across peer indexers.
Coordinating bucket creation, replication, and rebalancing.
Ensuring data availability and consistency across the cluster.
Handling peer node registration and monitoring their health.
Without the Master Node, the cluster cannot enforce replication or search policies, making it the critical role for regulating cluster functionality.
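As a hedged sketch, the roles are declared in the [clustering] stanza of server.conf; the pass4SymmKey value is a placeholder, and older releases use mode = master / mode = slave with master_uri instead of the newer names shown here:

    # On the manager (master) node
    [clustering]
    mode = manager
    replication_factor = 3
    search_factor = 2
    pass4SymmKey = <cluster_secret>

    # On each peer node
    [clustering]
    mode = peer
    manager_uri = https://manager.example.com:8089
    pass4SymmKey = <cluster_secret>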

Why the other options are incorrect

A. Indexer → Indexers are peer nodes that store and search data, but they do not regulate the cluster. They rely on instructions from the Master Node.

B. Deployer → The deployer is used in search head clustering, not indexer clustering. It distributes apps and configurations to search heads, not indexers.

D. Monitoring Console → This is a UI tool for monitoring Splunk components. It provides visibility but does not regulate or control cluster operations.

Reference
Splunk Docs – About indexer clusters

What log file would you search to verify if you suspect there is a problem interpreting a regular expression in a monitor stanza?


A. btool.log


B. metrics.log


C. splunkd.log


D. tailing_processor.log





C.
  splunkd.log

Explanation:
When Splunk reads data from a file (via a monitor stanza in inputs.conf), it goes through a pipeline of processing, including parsing lines, identifying timestamps, and applying custom transformations defined by regular expressions in props.conf and transforms.conf. If there is an error in the configuration or the engine has trouble interpreting a regex at any of these stages, it will log detailed error messages to the main Splunk daemon log.

✅ Why C is Correct:
The splunkd.log file is the primary log for the splunkd process, which covers all core Splunk functionality, including data input and parsing. Errors caused by misconfigured or uninterpretable regular expressions tied to a monitor stanza (a whitelist or blacklist in the stanza itself, or LINE_BREAKER, BREAK_ONLY_BEFORE, EXTRACT, or REPORT settings applied to that input in props.conf and transforms.conf) generate specific error messages here, typically pointing to the offending stanza and regex pattern.
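Two hedged ways to chase such an error (the sourcetype name is illustrative, and the component names are common examples rather than an exhaustive list):

    # From the CLI, check how the input and its parsing settings resolve
    splunk btool inputs list --debug
    splunk btool props list my_sourcetype --debug

    # From Splunk Web, search the internal index for parsing-related errors
    index=_internal source=*splunkd.log (log_level=ERROR OR log_level=WARN)
        (component=LineBreakingProcessor OR component=AggregatorMiningProcessor OR component=TailingProcessor)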

❌Why the Other Options Are Incorrect:

A. btool.log:
btool is a diagnostic command used to check how Splunk resolves configurations from various files (e.g., splunk btool inputs list --debug). It does not produce a persistent btool.log file by default. While btool is excellent for proactively testing your regex and configuration, it is not the log file where runtime interpretation errors are recorded.

B. metrics.log:
This log contains internal performance metrics about the Splunk instance itself (e.g., indexing throughput, search execution times, resource usage). It does not contain error messages about configuration syntax or regex interpretation.

D. tailing_processor.log:
This log is specific to the file input monitoring process. It tracks lower-level activities like which files are being watched, when they are read, and if there are issues with file access (e.g., permission denied). However, it does not handle the higher-level logic of line breaking or field extraction using regex; those errors surface in the more general splunkd.log.

Reference
The official Splunk documentation on "Troubleshoot configuration files" and "Monitor logs and files" consistently points administrators to the splunkd.log file for errors related to data processing and configuration.

Which of the following artifacts are included in a Splunk diag file? (Select all that apply.)


A. OS settings.


B. Internal logs.


C. Customer data.


D. Configuration files.





B.
  Internal logs.

D.
  Configuration files.

Explanation:
A Splunk diag is a collection of files bundled together to help Splunk Support (and administrators) troubleshoot problems with a Splunk instance. It is intentionally designed to provide maximum diagnostic information while protecting sensitive data.

✅Why B is Correct:
The diag includes a comprehensive set of Splunk's internal logs (e.g., splunkd.log, scheduler.log, metrics.log). These logs are essential for understanding the behavior, errors, and performance of the Splunk software itself.

✅ Why D is Correct:
The diag includes the configuration files (*.conf) from directories like $SPLUNK_HOME/etc/system/ and $SPLUNK_HOME/etc/apps/. This is critical for support to see how the instance is set up. However, it's important to note that the diag tool actively redacts known sensitive fields from these configuration files, such as passwords in pass4SymmKey or sslPassword.
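Generating a diag is a single command run on the instance in question; it writes an archive named along the lines of diag-<hostname>-<date>.tar.gz (by default into $SPLUNK_HOME, though the exact location can vary by version):

    splunk diag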

❌ Why the Other Options Are Incorrect:

A. OS settings:
This is incorrect. A standard Splunk diag does not collect operating system-level settings, logs, or performance data. While this information can be crucial for troubleshooting, it is considered part of the host system's environment and is not bundled into the Splunk-specific diag. An administrator would typically need to gather this data separately (e.g., using commands like top, vmstat, or checking system logs).

C. Customer data:
This is definitively incorrect. A core principle of the Splunk diag is that it does not include any indexed customer data. It will not bundle any raw data files from the hot/warm buckets or any database files from the KVStore. Its purpose is to collect metadata, configurations, and internal logs about how Splunk is processing data, not the data itself, to avoid exposing sensitive information.

Reference
This is documented in the official Splunk documentation on "What's in a diag?" The documentation explicitly states that the diag contains:
"Configuration files (with sensitive information, such as passwords, removed)... Log files from $SPLUNK_HOME/var/log/splunk/."

