UC is rolling out Sales App globally to bring sales teams together on one platform. UC expects millions of opportunities and accounts to be created and is concerned about the performance of the application. Which 3 recommendations should the data architect make to avoid data skew? Choose 3 answers.
A.
Use picklist fields rather than lookup to custom object.
B.
Limit assigning one user 10000 records ownership.
C.
Assign 10000 opportunities to one account.
D.
Limit associating 10000 opportunities to one account.
E.
Limit associating 10000 records looking up to same records.
Limit assigning one user 10000 records ownership.
Limit associating 10000 opportunities to one account.
Limit associating 10000 records looking up to same records.
Explanation:
A company is concerned about performance due to millions of opportunities and accounts. They want to avoid data skew.
Key Concepts:
✔️ Data Skew: This occurs when a single record (like an account) owns an excessive number of child records (like opportunities). This can lead to performance issues with sharing rules, ownership changes, and report generation because Salesforce has to re-calculate sharing for a very large number of records.
✔️ Account Data Skew: A single account having more than 10,000 child records (e.g., opportunities, contacts) can cause performance problems. This is known as Account Data Skew.
✔️ Owner Data Skew: A single user owning more than 10,000 records (of the same object type) can cause performance issues. This is known as Owner Data Skew.
✔️ Lookup Skew: A single record being referenced by more than 10,000 child records via a lookup relationship can also create performance bottlenecks. This is a broader category of data skew.
Explanation of Options:
B. Limit assigning one user 10,000 records ownership. This directly addresses Owner Data Skew. A user with too many records can slow down sharing calculations.
D. Limit associating 10,000 opportunities to one account. This directly addresses Account Data Skew. An account with too many child records can cause performance issues, especially with sharing recalculations.
E. Limit associating 10,000 records looking up to same records. This addresses the general issue of Lookup Skew. If a single record is at the end of a lookup relationship for too many records, it can cause performance problems.
By limiting these scenarios, you help maintain a balanced data distribution and ensure better performance.
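Before the rollout, it can also help to measure where skew already exists. The sketch below is a minimal, illustrative check using the REST query endpoint from Python (the requests library); the instance URL, API version, and token are placeholders, and the same pattern grouped by OwnerId surfaces ownership skew.

    import requests

    INSTANCE_URL = "https://yourInstance.my.salesforce.com"   # placeholder
    ACCESS_TOKEN = "00D...access.token..."                    # placeholder OAuth token

    # Accounts that parent more than 10,000 opportunities (account data skew).
    # On very large orgs an org-wide aggregate can time out; narrow it with a
    # WHERE filter (for example by CreatedDate) and run it in slices.
    soql = (
        "SELECT AccountId, COUNT(Id) oppCount "
        "FROM Opportunity "
        "GROUP BY AccountId "
        "HAVING COUNT(Id) > 10000"
    )

    resp = requests.get(
        f"{INSTANCE_URL}/services/data/v58.0/query",
        params={"q": soql},
        headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    )
    resp.raise_for_status()
    for row in resp.json()["records"]:
        print(row["AccountId"], row["oppCount"])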
Universal Containers (UC) provides shipping services to its customers. They use Opportunities to track customer shipments. At any given time, shipping status can be one of 10 values. UC has 200,000 Opportunity records. When creating a new field to track shipping status on the Opportunity, what should the architect do to improve data quality and avoid data skew?
A.
Create a picklist field, values sorted alphabetically.
B.
Create a Master-Detail to custom object ShippingStatus__c.
C.
Create a Lookup to custom object ShippingStatus__c.
D.
Create a text field and make it an external ID.
Create a picklist field, values sorted alphabetically.
Explanation:
Universal Containers needs to add a new field to track shipping status on 200,000 Opportunity records. There are only 10 possible status values. The architect's goal is to ensure data quality and avoid any unnecessary complexity or potential data skew. The solution should be simple, effective, and scalable for the large number of records.
Correct Option Explanation ✅
A. Create a picklist field, values sorted alphabetically.
Using a picklist is the best choice here. It guarantees data quality by limiting user input to the 10 predefined values. This prevents typos and inconsistencies that would make reporting difficult. For a small, static list of values, a picklist is far more efficient than creating a separate custom object and a relationship.
Incorrect Options Explanation ❌
B. Create a Master-Detail to custom object ShippingStatus__c.
This option is an example of over-engineering. Creating a new custom object just to hold 10 values is unnecessary and can introduce data skew if one of the records in that object becomes the parent to a large number of opportunities. A picklist is a much simpler and more direct solution.
C. Create a Lookup to custom object ShippingStatus__c.
Similar to a Master-Detail relationship, a lookup relationship is not the right tool for this job. It is overly complex for a simple, limited set of values. It also carries the risk of data skew if a single record in the custom object is referenced by many thousands of opportunities, leading to performance issues.
D. Create a text field and make it an external ID.
This approach would severely compromise data quality. A text field allows free-form text entry, which means users could enter "shipped", "Shipped", or "SHPD" for the same status. This would make it impossible to get accurate reports and would not be an acceptable solution for a Data Architect.
Reference: 📘
Trailhead: Data Modeling
Salesforce Help: Data Quality Best Practices
A customer is migrating 10 million orders and 30 million order lines into Salesforce using the Bulk API. The engineer is experiencing time-out errors or long delays querying parent order IDs in Salesforce before importing related order line items. What is the recommended solution?
A.
Query only indexed ID field values on the imported order to import related order lines.
B.
Leverage an External ID from source system orders to import related order lines.
C.
Leverage Batch Apex to update order ID on related order lines after import.
D.
Leverage a sequence of numbers on the imported orders to import related order lines.
Leverage an External ID from source system orders to import related order lines.
Explanation:
A customer is migrating a huge amount of data—10 million orders and 30 million order lines—into Salesforce using the Bulk API. The engineer is facing a common challenge: timeouts and delays when trying to find the Salesforce ID of a parent record to associate with its child records. The solution requires a more efficient way to establish these parent-child relationships during the bulk load.
Correct Option Explanation ✅
B. Leverage an External ID from source system orders to import related order lines.
This is the standard and most efficient solution. During the initial import of orders, the engineer should map the unique ID from the source system to a custom field in Salesforce with the External ID attribute. Then, when importing the order lines, the Bulk API can use this External ID to directly link to the correct parent Order record, eliminating the need to query for a Salesforce ID and preventing timeouts.
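As a rough sketch of this pattern (object and field names such as Order_Line__c, Order__c, and Source_Order_Id__c are illustrative assumptions), a Bulk API 2.0 ingest job can reference the parent by External ID directly in the CSV, so no Salesforce IDs need to be queried first:

    import requests

    INSTANCE_URL = "https://yourInstance.my.salesforce.com"   # placeholder
    ACCESS_TOKEN = "00D...access.token..."                    # placeholder
    API = f"{INSTANCE_URL}/services/data/v58.0"
    JSON_HEADERS = {"Authorization": f"Bearer {ACCESS_TOKEN}",
                    "Content-Type": "application/json"}

    # 1. Create a Bulk API 2.0 ingest job for the child object (illustrative API name).
    job = requests.post(
        f"{API}/jobs/ingest",
        headers=JSON_HEADERS,
        json={"object": "Order_Line__c", "operation": "insert", "lineEnding": "LF"},
    ).json()

    # 2. Upload the CSV. The Order__r.Source_Order_Id__c column (illustrative fields)
    #    lets Salesforce resolve the parent by External ID -- no pre-query for IDs.
    csv_body = (
        "Order__r.Source_Order_Id__c,Product_Code__c,Quantity__c\n"
        "SRC-1001,WIDGET-A,5\n"
        "SRC-1002,WIDGET-B,2\n"
    )
    requests.put(
        f"{API}/jobs/ingest/{job['id']}/batches",
        headers={"Authorization": f"Bearer {ACCESS_TOKEN}", "Content-Type": "text/csv"},
        data=csv_body,
    )

    # 3. Close the job so Salesforce starts processing the uploaded data.
    requests.patch(
        f"{API}/jobs/ingest/{job['id']}",
        headers=JSON_HEADERS,
        json={"state": "UploadComplete"},
    )

Salesforce resolves the Order__r.Source_Order_Id__c column to the correct parent Order record during processing, which is what removes the pre-import ID query that was timing out.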
Incorrect Options Explanation ❌
A. Query only indexed ID field values on the imported order to import related order lines.
This is the inefficient method the engineer is already using. Even with an indexed ID, querying millions of records in bulk is a very time-consuming operation and can easily lead to API timeouts or long processing delays, making it an unviable solution for a data migration of this scale.
C. Leverage Batch Apex to update order ID on related order lines after import.
This is a two-step, highly inefficient process. It would involve importing all 30 million order lines and then running a separate, resource-intensive Batch Apex job to update the parent IDs. This adds unnecessary complexity, time, and overhead to the migration.
D. Leverage a sequence of numbers on the imported orders to import related order lines.
Relying on a sequence of numbers is a very risky and unreliable approach. There is no guarantee that the records will be processed in the exact order you need. This could result in incorrect or broken parent-child relationships, leading to data integrity issues. An External ID is a reliable and safe alternative.
Reference: 📘
Salesforce Developer: Bulk API 2.0
Salesforce Help: External ID Fields
Universal Containers has more than 10 million records in the Order__c object. The query has timed out when running a bulk query. What should be considered to resolve the query timeout?
A.
Tooling API
B.
PK Chunking
C.
Metadata API
D.
Streaming API
PK Chunking
Explanation:
When dealing with large datasets in Salesforce, such as the 10 million records in the Order__c object, a query timeout can occur during bulk operations due to the volume of data being processed. To resolve this issue, PK Chunking (Primary Key Chunking) is the most appropriate solution among the provided options.
PK Chunking is a feature of the Salesforce Bulk API that allows large data queries to be split into smaller, manageable chunks based on the primary key (ID) of the records. This approach breaks down the query into smaller batches, reducing the likelihood of timeouts and improving performance when extracting or processing large volumes of data.
Here’s why the other options are not suitable:
A. Tooling API: The Tooling API is designed for accessing metadata about Salesforce objects, such as layouts, Apex classes, or custom objects, and is not relevant for handling large data queries or resolving query timeouts.
C. Metadata API: The Metadata API is used for managing customizations and configurations in Salesforce, such as deploying or retrieving metadata. It does not address data volume issues or query performance.
D. Streaming API: The Streaming API is used for real-time event monitoring and notifications, not for handling bulk data queries or resolving query timeouts.
How PK Chunking Works:
➡️ PK Chunking is enabled in the Bulk API by adding a header (Sforce-Enable-PKChunking) to the job request.
➡️ Salesforce automatically divides the query results into chunks based on record IDs, typically processing 100,000–250,000 records per chunk (configurable).
➡️ Each chunk is processed as a separate batch, reducing the load on the system and preventing timeouts.
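For illustration, enabling PK chunking on a Bulk API (the original, XML-based version) query job only requires that extra request header. A minimal sketch in Python, assuming a valid session ID and instance URL (both placeholders):

    import requests

    INSTANCE_URL = "https://yourInstance.my.salesforce.com"   # placeholder
    SESSION_ID = "00D...session.id..."                        # placeholder session ID

    job_xml = """<?xml version="1.0" encoding="UTF-8"?>
    <jobInfo xmlns="http://www.force.com/2009/06/asyncapi/dataload">
      <operation>query</operation>
      <object>Order__c</object>
      <contentType>CSV</contentType>
    </jobInfo>"""

    resp = requests.post(
        f"{INSTANCE_URL}/services/async/58.0/job",
        data=job_xml,
        headers={
            "X-SFDC-Session": SESSION_ID,
            "Content-Type": "application/xml; charset=UTF-8",
            # Ask Salesforce to split the extract into ID-range batches of 100,000 records.
            "Sforce-Enable-PKChunking": "chunkSize=100000",
        },
    )
    print(resp.status_code, resp.text)  # job details; add the query batch and poll for results next

Each resulting batch covers a contiguous range of record IDs, so no single extract has to scan the whole Order__c table.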
Reference:
Salesforce Documentation: Bulk API 2.0 and PK Chunking
Salesforce Help: Using PK Chunking to Extract Large Data Sets
UC has built a B2C ecommerce site on Heroku that shares customer and order data with a Heroku Postgres database. UC is currently utilizing Postgres as the single source of truth for both customers and orders. UC has asked a data architect to replicate the data into Salesforce so that Salesforce can now act as the system of record. What are the 3 considerations that the data architect should weigh before implementing this requirement? Choose 3 answers:
A.
Consider whether the data is required for sales reports, dashboards and KPI’s.
B.
Determine if the data is a driver of key processes implemented within Salesforce.
C.
Ensure there is a tight relationship between order data and an enterprise resource planning (ERP) application.
D.
Ensure the data is CRM-centric and able to populate standard or custom objects.
E.
A selection of the tool required to replicate the data.
F.
– Heroku Connect is required but this is confusing
Consider whether the data is required for sales reports, dashboards and KPI’s.
Determine if the data is a driver of key processes implemented within Salesforce.
Ensure the data is CRM-centric and able to populate standard or custom objects.
Explanation:
To transition Salesforce to the system of record for customer and order data from a Heroku Postgres database, the data architect must carefully evaluate several factors. The three most relevant considerations are:
A. Consider whether the data is required for sales reports, dashboards, and KPIs:
Salesforce is often used for reporting and analytics. The architect must determine if the customer and order data from Heroku Postgres is necessary for generating sales reports, dashboards, or KPIs within Salesforce. This ensures the data aligns with business needs and justifies the replication effort. For example, if the data is critical for tracking sales performance or customer metrics, it supports making Salesforce the system of record.
B. Determine if the data is a driver of key processes implemented within Salesforce:
The architect must assess whether the customer and order data drives core Salesforce processes, such as opportunity management, case management, or marketing campaigns. If the data is integral to workflows, automation, or other business processes in Salesforce, it supports the decision to replicate it and make Salesforce the system of record.
D. Ensure the data is CRM-centric and able to populate standard or custom objects:
Salesforce operates on a CRM data model, so the architect must confirm that the Heroku Postgres data (customer and order data) can be mapped to standard objects (e.g., Accounts, Contacts, Orders) or custom objects in Salesforce. This ensures compatibility with Salesforce’s data structure and functionality, enabling seamless integration and use.
Why not the other options?
C. Ensure there is a tight relationship between order data and an enterprise resource planning (ERP) application:
While integration with an ERP system might be relevant in some scenarios, the question does not mention an ERP system or its role in this context. The focus is on replicating data from Heroku Postgres to Salesforce, so this consideration is not directly applicable.
E. A selection of the tool required to replicate the data:
While choosing a tool (e.g., Heroku Connect, custom ETL, or middleware) is important for implementation, it is a tactical decision that comes after determining the need for and compatibility of the data. The question asks for considerations before implementation, making this less relevant than A, B, and D.
F. Heroku Connect is required but this is confusing:
This option is poorly worded and unclear. Heroku Connect is a tool for syncing data between Heroku Postgres and Salesforce, but it is not explicitly "required" unless specified by UC’s architecture. Additionally, this option does not represent a strategic consideration for deciding whether to replicate the data, making it irrelevant.
Reference:
Salesforce Documentation: Heroku Connect Overview
Salesforce Help: Data Integration Considerations
Universal Containers (UC) is transitioning from Classic to Lightning Experience. What does UC need to do to ensure users have access to its notes and attachments in Lightning Experience?
A.
Add Notes and Attachments Related List to page Layout in Lightning Experience.
B.
Manually upload Notes in Lightning Experience.
C.
Migrate Notes and Attachments to Enhanced Notes and Files using a migration tool
D.
Manually upload Attachments in Lightning Experience.
Migrate Notes and Attachments to Enhanced Notes and Files using a migration tool
Explanation:
In Salesforce Classic, notes and attachments are stored as separate entities (Notes and Attachment objects). In Lightning Experience, Salesforce introduced Enhanced Notes (using the ContentNote object) and Files (using ContentDocument and ContentVersion objects) to provide a more modern and flexible way to manage notes and file attachments. To ensure users can access notes and attachments in Lightning Experience, UC must migrate legacy notes and attachments to these new formats.
✅ C. Migrate Notes and Attachments to Enhanced Notes and Files using a migration tool:
This is the correct approach. Legacy Notes and Attachments from Classic are not automatically compatible with Lightning Experience’s enhanced features. Salesforce provides migration tools, such as the Notes and Attachments to Files Migration Tool or Data Loader, to convert legacy Notes to Enhanced Notes (ContentNote) and Attachments to Files (ContentDocument). This ensures users can access and interact with these records in Lightning Experience, leveraging features like rich text formatting for notes and file sharing capabilities.
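The migration tool is the right answer at scale, but to give a sense of what the conversion involves, the sketch below copies a single legacy Attachment into a File (ContentVersion) linked to the same parent record via the REST API. The IDs, instance URL, and API version are placeholders, and this is an illustration only, not the migration tool itself.

    import base64
    import requests

    INSTANCE_URL = "https://yourInstance.my.salesforce.com"   # placeholder
    ACCESS_TOKEN = "00D...access.token..."                    # placeholder
    API = f"{INSTANCE_URL}/services/data/v58.0"
    AUTH = {"Authorization": f"Bearer {ACCESS_TOKEN}"}

    attachment_id = "00P000000000001AAA"  # placeholder legacy Attachment Id

    # Fetch the legacy attachment's metadata and its binary body.
    att = requests.get(
        f"{API}/sobjects/Attachment/{attachment_id}",
        headers=AUTH,
        params={"fields": "Name,ParentId"},
    ).json()
    body = requests.get(f"{API}/sobjects/Attachment/{attachment_id}/Body",
                        headers=AUTH).content

    # Recreate it as a File (ContentVersion) shared to the original parent record.
    requests.post(
        f"{API}/sobjects/ContentVersion",
        headers={**AUTH, "Content-Type": "application/json"},
        json={
            "Title": att["Name"],
            "PathOnClient": att["Name"],
            "VersionData": base64.b64encode(body).decode("ascii"),
            "FirstPublishLocationId": att["ParentId"],
        },
    )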
❌ Why not the other options?
A. Add Notes and Attachments Related List to page Layout in Lightning Experience:
While adding related lists to page layouts is necessary to display data, legacy Notes and Attachments are not fully supported in Lightning Experience. Simply adding the related list does not address compatibility issues or enable Lightning’s enhanced features for notes and files.
B. Manually upload Notes in Lightning Experience:
Manually uploading notes is impractical and inefficient, especially for organizations with large volumes of existing notes. It also does not address the migration of legacy notes to the Enhanced Notes format.
D. Manually upload Attachments in Lightning Experience:
Similar to option B, manually uploading attachments is not feasible for large datasets and does not convert attachments to the Files format, which is optimized for Lightning Experience.
Reference:
Salesforce Documentation: Migrate Notes and Attachments to Enhanced Notes and Files
Salesforce Help: Enhanced Notes and Files in Lightning
North Trail Outfitters (NTO) is in the process of evaluating big objects to store large amounts of asset data from an external system. NTO will need to report on this asset data weekly. Which two native tools should a data architect recommend to achieve this reporting requirement?
A.
Standard reports and dashboards
B.
Async SOQL with a custom object
C.
Standard SOQL queries
D.
Einstein Analytics
Async SOQL with a custom object
Einstein Analytics
Explanation:
Salesforce Big Objects are designed to store massive volumes of data (e.g., millions or billions of records) in a scalable way, typically for archival or historical data. However, Big Objects have limitations, particularly around reporting, as they do not support standard Salesforce reports or dashboards. To meet NTO's requirement for weekly reporting on asset data stored in Big Objects, the data architect should recommend tools that can query and analyze Big Object data effectively.
B. Async SOQL with a custom object:
Asynchronous SOQL (Async SOQL) is a native Salesforce tool designed to query large volumes of data in Big Objects. It runs queries asynchronously, allowing complex queries on Big Objects without hitting governor limits. The results of Async SOQL queries can be stored in a custom object, which can then be used for reporting or further analysis. This is a suitable approach for NTO's weekly reporting needs, as it allows data to be extracted from Big Objects and made available in a format compatible with standard Salesforce reporting tools.
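As a rough illustration only (Async SOQL was a pilot feature and its availability has since changed), an Async SOQL job is submitted over REST and writes its results into a target custom object that standard reports can then use. The big object, target object, and field names below are assumptions, and the payload follows the pilot-era async-queries resource:

    import requests

    INSTANCE_URL = "https://yourInstance.my.salesforce.com"   # placeholder
    ACCESS_TOKEN = "00D...access.token..."                    # placeholder

    # Copy big-object asset data into a reportable custom object (names are illustrative).
    payload = {
        "query": "SELECT Asset_Id__c, Status__c FROM Asset_History__b WHERE Region__c = 'EMEA'",
        "operation": "insert",
        "targetObject": "Asset_Report__c",
        "targetFieldMap": {
            "Asset_Id__c": "Asset_Id__c",
            "Status__c": "Status__c",
        },
    }

    resp = requests.post(
        f"{INSTANCE_URL}/services/data/v58.0/async-queries/",
        headers={"Authorization": f"Bearer {ACCESS_TOKEN}",
                 "Content-Type": "application/json"},
        json=payload,
    )
    print(resp.json())  # returns the async query job details for status polling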
D. Einstein Analytics (now Tableau CRM):
Einstein Analytics (rebranded as Tableau CRM) is a native Salesforce analytics platform that can connect to Big Objects for advanced reporting and visualization. It supports querying and aggregating large datasets, making it ideal for generating weekly reports and dashboards based on asset data stored in Big Objects. Einstein Analytics provides robust visualization capabilities and can handle the scale of Big Object data, meeting NTO's reporting requirements.
Why not the other options?
A. Standard reports and dashboards:
Big Objects do not support standard Salesforce reports and dashboards. Standard reporting tools are designed for standard and custom objects, not Big Objects, which have restricted query capabilities and cannot be directly used in standard reports.
C. Standard SOQL queries:
Standard SOQL queries are not suitable for Big Objects because they are limited by governor limits and cannot efficiently handle the massive scale of data stored in Big Objects. Async SOQL is specifically designed for this purpose.
Reference:
Salesforce Documentation: Big Objects Overview
Salesforce Documentation: Async SOQL
Salesforce Help: Tableau CRM (Einstein Analytics) for Big Objects
A customer wants to maintain geographic location information including latitude and longitude in a custom object. What would a data architect recommend to satisfy this requirement?
A.
Create formula fields with geolocation function for this requirement.
B.
Create custom fields to maintain latitude and longitude information
C.
Create a geolocation custom field to maintain this requirement
D.
Recommend AppExchange packages to support this requirement.
Create a geolocation custom field to maintain this requirement
Explanation:
The most direct and efficient way to store geographic coordinates in Salesforce is by using the dedicated Geolocation custom field type. This single field stores both latitude and longitude as a single, composite data type. This approach is designed for easy use in location-based services, such as calculating distance, and provides a clear, standardized way to handle this type of data.
✅ Correct Option (C) 🗺️:
Create a geolocation custom field to maintain this requirement. This is the correct and recommended solution. The Geolocation data type is purpose-built for this exact requirement. It ensures data is stored in the correct format, simplifies validation, and is compatible with location-based features within Salesforce and mobile applications.
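One practical benefit of the compound Geolocation type is that SOQL's location functions work directly against it. A minimal sketch, assuming a custom object Store__c with a geolocation field Location__c (instance URL, token, and names are placeholders):

    import requests

    INSTANCE_URL = "https://yourInstance.my.salesforce.com"   # placeholder
    ACCESS_TOKEN = "00D...access.token..."                    # placeholder

    # Stores within 25 miles of a given point, nearest first (object/field names assumed).
    soql = (
        "SELECT Name, Location__Latitude__s, Location__Longitude__s "
        "FROM Store__c "
        "WHERE DISTANCE(Location__c, GEOLOCATION(37.775, -122.418), 'mi') < 25 "
        "ORDER BY DISTANCE(Location__c, GEOLOCATION(37.775, -122.418), 'mi')"
    )

    resp = requests.get(
        f"{INSTANCE_URL}/services/data/v58.0/query",
        params={"q": soql},
        headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    )
    for store in resp.json()["records"]:
        print(store["Name"])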
Incorrect Options ❌:
🔴 A. Create formula fields with geolocation function for this requirement.
Formula fields are used to calculate values based on other fields, not to store user-input data. This option is not a valid way to fulfill the requirement.
🔴 B. Create custom fields to maintain latitude and longitude information.
While technically possible, creating two separate number fields for latitude and longitude is a poor design choice. It lacks the native validation and integrated functionality of a single Geolocation field.
🔴 D. Recommend AppExchange packages to support this requirement.
While the AppExchange offers many powerful tools, the core requirement of storing coordinates can be met with a standard, native Salesforce feature. A data architect would not recommend an external solution for such a fundamental need.
What makes Skinny tables fast? Choose three answers.
A.
They do not include soft-deleted records
B.
They avoid resource intensive joins
C.
Their tables are kept in sync with their source tables when the source tables are modified
D.
They can contain fields from other objects
E.
They support up to a max of 100 columns
They do not include soft-deleted records
They avoid resource intensive joins
Their tables are kept in sync with their source tables when the source tables are modified
Explanation:
Skinny tables are special tables created by Salesforce Support to improve the performance of frequently run reports, list views, and queries on very large objects. They work by copying an object's most frequently used standard and custom fields into a single physical table, so queries no longer have to join the underlying standard-field and custom-field tables.
Correct Options ✅:
A. They do not include soft-deleted records.
Skinny tables only contain active, non-archived data. This reduces the overall size of the table, making queries faster. Soft-deleted records are stored separately and are not included in the main table used for reporting.
B. They avoid resource intensive joins.
This is the primary reason for using skinny tables. By keeping an object's standard and custom fields together in one table, they eliminate the costly database join between the underlying standard-field and custom-field tables, which significantly speeds up report and query performance.
C. Their tables are kept in sync with their source tables when the source tables are modified.
Salesforce ensures that the skinny table is updated whenever a record in the source object is modified. This guarantees that reports and views using the skinny table are always showing the most current data.
Incorrect Options ❌:
D. They can contain fields from other objects.
This statement is incorrect. Skinny tables cannot contain fields from other objects; they only hold fields from the single object they are created for. They are not a general-purpose data consolidation tool, and this is not what makes them fast.
E. They support up to a max of 100 of columns.
The column limit for skinny tables is 100, so this statement is factually correct. However, it is a technical detail, not a reason for them being "fast." The speed comes from the other three points.
Universal Containers (UC) has multi-level account hierarchies that represent departments within their major Accounts. Users are creating duplicate Contacts across multiple departments. UC wants to clean the data so as to have a single Contact across departments. What two solutions should UC implement to cleanse their data? Choose 2 answers
A.
Make use of a third-party tool to help merge duplicate Contacts across Accounts.
B.
Use Data.com to standardize Contact address information to help identify duplicates.
C.
Use Workflow rules to standardize Contact information to identify and prevent duplicates.
D.
Make use of the Merge Contacts feature of Salesforce to merge duplicates for an Account.
Make use of a third-party tool to help merge duplicate Contacts across Accounts.
Use Data.com to standardize Contact address information to help identify duplicates.
Explanation:
When dealing with duplicate records across multiple accounts, a data architect must recommend a solution that can handle this complexity. Standard Salesforce features are limited in this area.
Correct Options ✅:
🟢 A. Make use of a third-party tool to help merge duplicate Contacts across Accounts.
A third-party AppExchange tool is the ideal solution for this complex scenario. Standard Salesforce functionality only allows merging of contacts within the same account. A specialized tool is required to identify and merge duplicates that exist under different parent accounts.
🟢 B. Use Data.com to standardize Contact address information to help identify duplicates.
Data.com (now D&B Optimizer) is a data service that helps standardize, enrich, and validate data. By standardizing contact information, it becomes much easier to identify and prevent duplicate records from being created in the first place.
Incorrect Options ❌:
🔴 C. Use Workflow rules to standardize Contact information to identify and prevent duplicates.
Workflow rules are an automation tool for updating fields based on simple criteria. They are not a robust solution for standardizing or identifying duplicates across a large dataset.
🔴 D. Make use of the Merge Contacts feature of Salesforce to merge duplicates for an Account.
This is a native Salesforce feature, but it only works for merging duplicate contacts within a single account. It cannot be used to merge contacts that exist under different parent accounts. Therefore, it is not a complete solution for UC's problem.
Universal Containers is implementing Salesforce lead management. UC procures lead data from multiple sources and would like to make sure lead data has company profile and location information. Which solution should a data architect recommend to ensure lead data has both profile and location information?
A.
Ask salespeople to search for and populate company profile and location data
B.
Run reports to identify records that do not have company profile and location data
C.
Leverage external data providers to populate company profile and location data
D.
Export data out of Salesforce and send to another team to populate company profile and location data
Leverage external data providers to populate company profile and location data
Explanation:
This question tests the architect's ability to recommend scalable, automated solutions for data quality and enrichment.
✅ Why C is Correct:
Leveraging external data providers (like ZoomInfo, DiscoverOrg, or similar services) via APIs or managed packages is the most scalable and efficient solution. These services specialize in maintaining accurate and up-to-date company profiles and location data. This approach automates the process, ensures high data quality, and eliminates manual, error-prone work for sales reps. It is a best practice for data enrichment.
❌ Why A is Incorrect:
Relying on salespeople to manually search for and populate this data is inefficient, not scalable, and will result in inconsistent and incomplete data. It takes valuable time away from selling and is a poor user experience.
❌ Why B is Incorrect:
Running reports only identifies the problem; it does not solve it. It is a reactive, not a proactive, measure. Someone would still have to manually fix the records found in the report, which brings us back to the inefficiencies of option A.
❌ Why D is Incorrect:
Exporting data out of Salesforce for another team to manually update is a security risk, breaks data integrity, and is highly inefficient. It introduces latency (data is not updated in real-time) and creates a complex, error-prone process for syncing data back into Salesforce.
Reference:
The core principle here is automation and leveraging specialized tools. A Data Architect should always seek to automate data quality processes rather than rely on manual effort.
Universal Containers has a custom object with millions of rows of data. When executing SOQL queries, which three options prevent a query from being selective? (Choose three.)
A.
Using leading % wildcards.
B.
Using trailing % wildcards.
C.
Performing large loads and deletions.
D.
Using NOT and != operators.
E.
Using a custom index on a deterministic formula field.
Using leading % wildcards.
Using NOT and != operators.
Using a custom index on a deterministic formula field.
Explanation:
This question tests the deep understanding of Salesforce query performance and selectivity, a critical concept for managing large data volumes.
✅ Why A is Correct (Using leading % wildcards):
A query with a leading wildcard (e.g., WHERE Name LIKE '%test%') cannot use an index. The database must perform a full table scan, checking every single record, which is the definition of a non-selective query. This is a major performance killer on large objects.
✅ Why D is Correct (Using NOT and != operators):
Negative operators like NOT, !=, and NOT EQUALS are inherently non-selective. They must evaluate all records that do not match the condition. For example, on a million-record table, a query for WHERE Status != 'Closed' would need to scan almost the entire table if most records are 'Open'.
✅ Why E is Correct (Using a custom index on a deterministic formula field):
This is the trickiest option. A custom index can make a formula-field filter selective, but only when the formula is deterministic and the index has actually been created by Salesforce support. Non-deterministic formulas (for example, ones referencing TODAY(), NOW(), or fields on related records) cannot be custom-indexed at all, so filters on them cannot be selective. The option's phrasing is subtle, but the pitfall it points to is relying on a formula-field index that the query optimizer cannot actually use.
❌ Why B is Incorrect (Using trailing % wildcards):
A query with only a trailing wildcard (e.g., WHERE Name LIKE 'test%') can use an index. This is a selective query pattern because the database can quickly find all records starting with "test" using the index.
❌ Why C is Incorrect (Performing large loads and deletions):
While performing large data operations can fragment indexes and temporarily impact overall database performance, it does not directly change the selectivity of a specific SOQL query's WHERE clause. Selectivity is determined by the structure of the query itself and the available indexes.
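To verify selectivity for a specific filter, the REST query resource accepts an explain parameter that returns the optimizer's query plan instead of running the query. The sketch below contrasts a non-selective filter (leading wildcard plus a negative operator) with a selective one; the object name, custom field, API version, and credentials are placeholder assumptions.

    import requests

    INSTANCE_URL = "https://yourInstance.my.salesforce.com"   # placeholder
    ACCESS_TOKEN = "00D...access.token..."                    # placeholder
    HEADERS = {"Authorization": f"Bearer {ACCESS_TOKEN}"}

    queries = {
        # Leading wildcard and a negative operator: no index can be used, full scan likely.
        "non_selective": "SELECT Id FROM Order__c WHERE Name LIKE '%test%' AND Status__c != 'Closed'",
        # Trailing wildcard on an indexed field: the optimizer can use the index.
        "selective": "SELECT Id FROM Order__c WHERE Name LIKE 'test%'",
    }

    for label, soql in queries.items():
        plan = requests.get(
            f"{INSTANCE_URL}/services/data/v58.0/query",
            params={"explain": soql},
            headers=HEADERS,
        ).json()
        # Each plan entry reports the leading operation type (Index vs TableScan) and relative cost.
        for p in plan.get("plans", []):
            print(label, p["leadingOperationType"], p["relativeCost"])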
Reference:
Salesforce documentation on "Query Optimization" and "Query Selectivity." Key resources include the "Force.com SOQL Best Practices: Selective SOQL Queries" developer guide.