One of the most critical initiatives for any organization is building a business intelligence (BI) infrastructure and solution. Before undertaking any BI design project, it is essential to put the proper resources in place for a successful implementation and evolution. Gartner, one of the leading information technology research and advisory firms, states that business intelligence projects succeed only 20 to 30 percent of the time. This whitepaper provides a high-level summary of objectives, best practices and technical environment methodologies, intended as a general guideline to help make your BI project one of that successful 20 to 30 percent. Staff, budget, time frame and goals differ from company to company, so adapt these guidelines to your own circumstances.
Collaborate with business users and technical staff regularly. Business users provide necessary insight into domain knowledge and the usability of the data; this helps the technical staff put the right solution in place to achieve the end goals and improve business functions.
Involve stakeholders and upper management in high-level status meetings and project planning. This allows them to see the effort and understand the process they are funding and/or standing behind.
Utilize subject matter experts in the business and technical areas involved in the project: business/application use, database design, SSIS, SSAS, SSRS and so on. Inexperience can lead to poor architecture, delayed progress and performance issues.
Set up Dev, Test and QA environments where possible; Dev and QA are essential. This provides separation from currently running production environments and allows use case testing and break/fix before go-live.
Use visualization prototypes to get less technical audiences excited about the end results and to bridge the gap in technical expertise.
Develop teams with separate areas of focus and appoint leads to coordinate between teams. This prevents any one person from carrying too much of the burden, and keeps confusion and politics down by controlling communication.
Understand the evolution of the technology for expandability and compatibility; scalability is key.
Cross-train staff in other areas during the project so that each person understands all the moving pieces before the solution is in production, rather than each person knowing only how their own piece works.
Set timelines and milestones for important phases throughout the project. These help gauge the overall health of the project and indicate when it may be wise to abort or postpone to save budget and time.
Set a cap on budget to avoid overspending; set it higher than anticipated and above initial specs if possible. This keeps a healthy ceiling and avoids surprises in the final expense total. It may also help determine whether the project should be aborted to save costs.
Document, Document, Document. Have technical writers or people who understand the process thoroughly document the procedures, tools and instructions being put into place. Have the documentation reviewed by people who were not involved with the development to ensure the documentation can be easily understood without prior knowledge
Don’t compromise below requirements. This is particularly true for data warehouses: it’s better to have too much and discover later you don’t need it than to have too little and discover later you need more. This can be the difference between a complete tear-down/rebuild and a small edit.
Choose an interface/reporting tool wisely. Don’t assume the most expensive tool is the best choice; while many of the advertised bells and whistles are nice, simpler tools are sometimes easier to learn and have more than enough features for your company’s needs.
Extract, Transform and Load (ETL) describes the process of moving data between databases, tables, files, etc. with the ability to change and conform it to meet business requirements. Within the SQL Server platform, Integration Services (SSIS) is the design tool used to implement such a solution.
Here are some things to consider when planning for an SSIS environment:
Memory. SSIS benefits from memory more than from any other hardware component from a performance standpoint. Make sure you allocate enough memory on the server, outside of what’s allocated to the SQL Server database engine, to allow as much caching of pipeline data as necessary.
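For example, one way to leave headroom for SSIS is to cap the database engine’s memory with sp_configure; the figure below is purely illustrative and depends on the server’s total RAM:

```sql
-- Illustrative only: cap the database engine at 48 GB on, say, a 64 GB server,
-- leaving headroom for the OS and for SSIS pipeline buffers.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 49152;
RECONFIGURE;
```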
Processors. By default, SSIS executes a number of tasks in parallel equal to the number of logical processors plus two; this can be adjusted via the package’s MaxConcurrentExecutables property.
Framework. It’s recommended to set up a framework for SSIS auditing and logging, as well as for scalability and migrations. Use package configurations for 2008 R2 and prior; in 2012 this functionality is built in via the SSIS catalog.
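As a sketch of what the built-in 2012 logging provides, the SSISDB catalog views can be queried directly; the 24-hour window below is an arbitrary choice:

```sql
-- Recent package executions from the SSIS catalog (SQL Server 2012+).
-- Status 4 = failed, 7 = succeeded.
SELECT e.execution_id,
       e.folder_name,
       e.project_name,
       e.package_name,
       e.status,
       e.start_time,
       e.end_time
FROM   SSISDB.catalog.executions AS e
WHERE  e.start_time >= DATEADD(HOUR, -24, SYSDATETIMEOFFSET())
ORDER BY e.start_time DESC;
```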
Design. Use variables when possible. Avoid too many parent/child package calls (this can be a troubleshooting nightmare). Consider reusability. Based on the task(s), determine when to let SSIS do the processing and when the SQL database engine will be better
Storage. For SSIS 2008 R2 and older, use the package store for database or file system storage; for SSIS 2012, use the catalog.
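Projects deployed to the 2012 catalog live in folders created on the SSISDB instance; a minimal sketch (the folder name is illustrative):

```sql
-- Create a folder in the SSIS catalog to hold deployed projects.
DECLARE @folder_id BIGINT;
EXEC SSISDB.catalog.create_folder
     @folder_name = N'Finance ETL',   -- illustrative name
     @folder_id   = @folder_id OUTPUT;
```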
Analysis Services within the SQL Server tool stack is used to compile current and historical information in a denormalized schema that allows for faster aggregations, analysis and data mining to help forecast business trends and help make better business decisions.
Here are some things to consider when planning for an SSAS environment:
Processors. SSAS thrives on heavy processing power; because large amounts of data are being aggregated and rolled up, splitting this across as many processors as possible will help improve performance
Disks. With warehouses, you are typically talking about years of historical data. This data is stored in cubes, with aggregations and processed objects cached and stored on disk. Adequate disk space is essential for ever-growing warehouses; RAID 10 or RAID 5 configurations are typically ideal from a performance standpoint.
Memory. Memory is also essential for the queries used in processing cubes as well as for cube browsing; by default SSAS is set to use up to 80% of the memory on the server. It’s typically a best practice to separate SSAS and the database engine so they do not compete for resources.
Design. Determine KPI metrics and goals. Establish the grain of each fact table, which should consist only of measurable data. Decide what kind of schema will work best for your needs (star or snowflake). Use surrogate keys when possible. Define aggregations. Logically partition cubes. Use dimension hierarchies for better usability.
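A minimal star-schema sketch along these lines (table and column names are illustrative, not prescribed) might look like:

```sql
-- Illustrative star schema: one dimension and one fact table.
-- Surrogate keys (IDENTITY) decouple the warehouse from source system keys.
CREATE TABLE dbo.DimProduct (
    ProductKey   INT IDENTITY(1,1) PRIMARY KEY,  -- surrogate key
    ProductCode  NVARCHAR(20)  NOT NULL,         -- natural/business key
    ProductName  NVARCHAR(100) NOT NULL,
    Category     NVARCHAR(50)  NOT NULL          -- hierarchy level for browsing
);

-- Grain: one row per product per date; measures only.
CREATE TABLE dbo.FactSales (
    DateKey      INT   NOT NULL,                 -- e.g. 20240131, joins to a date dimension
    ProductKey   INT   NOT NULL REFERENCES dbo.DimProduct (ProductKey),
    SalesAmount  MONEY NOT NULL,
    Quantity     INT   NOT NULL
);
```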
Reporting Services in the SQL Server tool stack is simply a graphical design tool to display your data in whatever output format desired. Analytical data in a data warehouse or transactional data sitting in an applications database system can be viewed in a number of ways with compelling visualizations.
Here are some things to consider when planning for an SSRS environment:
Processor and Memory. The performance of reports is mainly driven by the SQL queries used in the reports; although Visual Basic expressions can be written and charts and graphs rendered, that processing effort is small. The report server databases can be hosted on any SQL instance, but depending on the activity of the system, you may want to consider installing the report server instance on its own server.
Design. The main thing to consider here is query design; use efficient query-writing techniques to avoid performance bottlenecks. Determine when to build data sets with logic in SQL code and when it may be easier or more efficient to write an expression. Depending on the complexity and frequency of the reports being run, as well as the activity on the source data server, it is sometimes best to replicate the source data to a read-only reporting database or instance to alleviate I/O.
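As one example of putting the logic in SQL rather than in the report, aggregation can be pushed into the dataset query so SSRS renders a small result set instead of grouping thousands of detail rows itself (table and column names are illustrative):

```sql
-- Aggregate in the database; the report receives one row per category.
SELECT d.Category,
       SUM(f.SalesAmount) AS TotalSales,
       SUM(f.Quantity)    AS TotalQuantity
FROM   dbo.FactSales  AS f
JOIN   dbo.DimProduct AS d
       ON d.ProductKey = f.ProductKey
GROUP BY d.Category
ORDER BY TotalSales DESC;
```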
Business Intelligence can involve a number of technologies and foster many opportunities for improving your business. This document serves as a guideline for planning strategies to begin successfully building a SQL Server Business Intelligence environment.