Posted at 02:50 AM in Interesting & Cool | Permalink | Comments (0) | TrackBack (0)
Nice case study about one of the best-of-breed installations Full360 built for BigDoor.
BigDoor decided to move away from its custom ETL solution and move into a more mature data warehousing platform. Malek said he chose Full 360 and Vertica because the solution was both advanced and easy to implement. "There couldn't have been a better fit for what we needed," he said. "Vertica had a reputation for being flat-out fast; Full 360 made the information-gathering, licensing and installation process simple. It was perfect."
New markets, new technologies.
Posted at 08:53 AM in Case Stories | Permalink | Comments (0) | TrackBack (0)
I've been going through my entire library this weekend, finding many treasures, and came across the old Teraplex white paper. It contains the classic OLAP definition: the five strengths of Essbase over relational DBs. Now is a good time to refresh and reconsider it in the light of new scalability, pricing and infrastructure. I think, based on these five, that OLAP stands up well. I tend to wonder if the market understands what it can do, given the vast array of products out there.
Here's the Teraplex DW White Paper. I excerpt a salient section:
Online Analytic Processing (OLAP)
Because OLAP technology provides user and data scalability, performance, read/write capabilities and calculation functionality, it meets all the requirements of a data mart. Two other options—personal productivity tools, and data query and reporting tools—cannot provide the same level of support. Personal productivity tools such as spreadsheets and statistical packages reside on individual PCs, and therefore support only small amounts of data to a single user. Data query and reporting tools are SQL-driven, and frequently used for list-oriented, basic drill-down analysis and report generation. These tools do not offer the predictable performance or robust calculations of OLAP. The OLAP technology option supports collaboration throughout the business management cycle of reporting, analysis, what-if modeling and planning. Most important in OLAP technology are its sophisticated analytic capabilities, including:
Aggregations, which simply add numbers based upon levels defined by the application. For example, the application may call for adding up sales by week, month, quarter and year.
Matrix calculations, which are similar to calculations executed within a standard spreadsheet. For example, variances and ratios are matrix calculations.
Cross-dimensional calculations, which are similar to the calculations executed when spreadsheets are linked and formulas combine cells from different sheets. A percent product share calculation is a good example of this, as it requires the summation of a total and the calculation of percentage contribution to total sales of a given product.
Procedural calculations, in which specific calculation rules are defined and executed in a specific order. For example, allocating advertising expense as a percent of revenue contribution per product is a procedural calculation, requiring procedural logic to properly model and execute sophisticated business rules that accurately reflect the business.
OLAP-aware calculations, which provide the analytical intelligence necessary for multi-dimensional analysis, such as the understanding of hierarchy relationships within dimensions. These calculations include time intelligence and financial intelligence. For example, an OLAP-aware calculation would calculate inventory balances in which Q1 ending inventory is understood not to be the sum of January, February and March inventories.
OLAP technology may be either relational or multidimensional in nature. Relational OLAP technologies, while suitable for large, detail-level sets of data, have inherent weaknesses in a decision-support environment. Response time for decision-support queries in a relational framework can vary from minutes to hours. Calculations are limited to aggregations and simple matrix processing. Changes to metadata structures—for example, the organization of sales territories—usually require manual administrator intervention and re-creation of all summary tables. Typically, these relational solutions are read-only due to security and performance concerns, and therefore cannot support forward-looking modeling, planning or forecasting applications.
In addition, resolving simple OLAP queries, such as: “Show me the top ten and bottom ten products based on sales growth by region, and show the sales of each as a percentage of the total for its brand,” can require hundreds of SQL statements and huge amounts of system resources. For these reasons, many sites that initially deploy these technologies to support ad hoc reporting and analysis are forced to disable access and limit the number of concurrent queries.
For analytic and decision-support applications, implementation and maintenance are often more cumbersome in a relational environment. There are very few tools to define, build or manage relational schemes, forcing developers and consultants to manually design and continually optimize databases, leading to long implementation times. Furthermore, a large IT support staff is required to implement, maintain and update the environment, increasing the overall cost and limiting the IT organization’s capacity to address other strategic information systems projects. Yet another concern is security, as a Relational Database Management System (RDBMS) provides table/column security only and cannot easily control access to individual facts in a star schema. The result is that it is often difficult or impossible to provide robust user data access security in an analytic relational database other than at the report level.
Multidimensional technology is free from the limitations that relational databases face in decision-support environments, as multidimensional OLAP delivers sub-second response times while supporting hundreds and thousands of concurrent users. In addition, it supports the full range of calculations, from aggregations to procedural calculations. Companies using Hyperion Essbase are able to rapidly deploy data marts and adapt to changing business environments. Since Hyperion Essbase is a server-centric technology, companies can share information readily and securely, with protection down to the most granular levels. Multiple users can update the database and see the impact of those updates, which is essential in planning and forecasting applications.
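To make the excerpt's calculation types concrete, here is a minimal Python sketch of three of them: an aggregation, a cross-dimensional percent share, and an OLAP-aware time balance. The product names, figures and function names are all made up for illustration, and this is plain Python, not how Essbase itself computes anything.

```python
# Toy monthly data for two products. All names and numbers are hypothetical.
MONTHS = ["Jan", "Feb", "Mar"]

sales = {
    "Cola": {"Jan": 100, "Feb": 120, "Mar": 80},
    "Root Beer": {"Jan": 50, "Feb": 30, "Mar": 20},
}
ending_inventory = {
    "Cola": {"Jan": 40, "Feb": 35, "Mar": 60},
    "Root Beer": {"Jan": 10, "Feb": 15, "Mar": 5},
}


def quarter_sales(product):
    """Aggregation: a flow measure such as sales sums up the time hierarchy."""
    return sum(sales[product][month] for month in MONTHS)


def product_share(product):
    """Cross-dimensional: one product's sales as a percent of all products."""
    total = sum(quarter_sales(p) for p in sales)
    return 100.0 * quarter_sales(product) / total


def quarter_ending_inventory(product):
    """OLAP-aware time balance: a balance measure takes the last month's
    value rather than the sum of the three months."""
    return ending_inventory[product][MONTHS[-1]]


if __name__ == "__main__":
    print(quarter_sales("Cola"))             # sum of Jan, Feb and Mar sales
    print(product_share("Cola"))             # percent of total Q1 sales
    print(quarter_ending_inventory("Cola"))  # March ending inventory only
```

The point of the last function is the one the white paper makes: the quarter's ending inventory is March's number, and a tool that blindly sums children gets it wrong.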
A couple notes about these claims.
The OLAP aware query is the most substantial and time-saving aspect of writing with Essbase. It is just as significant now as it ever was, if not more. While I've seen very few applications with full requirements of historically contextual slowly changing dimensions (most people restate), keeping metadata aware queries stable as the dimensions change is almost always a requirement. Dimensions change, your queries shouldn't have to.
Security is still key. The ability to lock down to the cell level and determine sections of the database that are read/write vs read-only is a key differentiator.
Aggregations and matrix calculations can be done quite well in relational tech. In columnar tech, cross-dimensional data can be handled as well, although it takes a bit of doing. But Essbase still shines in procedural calculations and in the other two areas, cross-dimensional and OLAP-aware calculations.
Whichever way the technology goes, my colleagues and I at Full 360 will offer a broad selection in the best environment. Which brings us to one key paragraph - the one about ROLAP. We've got that handled, and the way we put together our two-tiered database environments (when necessary) has taken the pain out of staffing for DW development and maintenance. We've come a long way in the past decade. While much of the theory is still in force, technologies and practices have moved forward.
Posted at 07:53 AM in Case Stories | Permalink | Comments (0) | TrackBack (0)
Essbase Tuning Notes
20050819.0900
1. Clearblock works like cleardata if you fix on a dense member. It zeroes out the cells. This can work in lieu of CLEARDATA.
2. Dynamic calc & store will work well for large retrieves.
3. Remember "Clearblock Dynamic".
4. Whenever you use dynamic calc & store in the sparse dimension, it will make all of the sparse dynamics, dynamic calc and store. Use this when you are sparse formula intensive. Whenever you do a retrieve of a member combination, and one of the sparse members is dynamic, then everything becomes dynamic rather than just dynamic calc & store.
5. Sometimes an outline restructure marks everything as dirty. Mark them as clean before the users have a chance to actually dirty up the data.
6. HBR screws up EAS installs.
7. Move as many calcs as possible to dynamic members in the outline.
8. Don't forget disk defrag.
9. Perform dynamic aggregations before sparse formulas in sparse dimensions.
10. To enable parallel calcs, put a small, flat sparse dim last (as the anchor) in the outline. If you're only calculating a few of the members, you get no advantage. Basically the 'task list' is allocated across the number of processors.
11. Don't assume hyperthreading works. Set calcparallel only to the number of CPUs.
Posted at 06:15 PM | Permalink | Comments (2) | TrackBack (0)
Back from the road, it's great to be in front of my own 23 inch monitor once again. When you're as old and grumpy as I am, being away from your own screen starts to get annoying after a month or two.
So I've noticed that the price of monitors has gone way down, and it reminds me of the cyclicals in the tech sector, as we geeks with moola waste ours on early adopter stuff and a year later it's in Best Buy. But we didn't adopt Blu Ray, not in our own CD burners nor much in our home theatres. I always think of high tech consumer electronics in these cycles, and we are due for a revolution in digital audio quality, which has really gone to shite, but it hasn't come around yet. In the meantime, Best Buy is downsizing, and I still cannot find 'My Fair Lady' in any digital format - let alone Blu Ray. Where a bunch of my time, and yours has probably gone, my geeky friend, is into the catch all cloud.
What is the catch all cloud? It's this.
It's where all of your digital stuff goes if you're a packrat like me. As you can see, I see stuff and route it to different places. A lot of stuff is double backed up in general cloud places like Google Drive and through Backblaze. Clearly, the most important piece is Evernote, and relatively little goes, of necessity, to the parallelograms of social media. I think that social media is not organized for findings of fact. But you know I don't - we don't need to go there. My point is that I think a lot of effort will be spent in differentiating cloud hybrids in the future and that's where things are going, not so much to gadgetry and media you can get at Best Buy. Sooner or later, current ETL vendors are going to recognize this.
In other geekery, I've added that expensive Magic Mouse to my inventory in order to get rid of the industrial injury I get from old ratchety mouse wheels. Think a moment. Do you ever get a pain in the first pad of your index finger from mouse-wheeling? Yes. Interestingly, it's not a joint or muscle problem from that repetitive motion, but just the tip of my finger gets sore. No more.
Here's another tip, guys. You would have thought that after all these years there would be something that gets all of the human grease off of glass. Well, I've known for years that the folks at Neutrogena have done that. Their clear (amber) soap gets my face and hands squeaky clean, and it works wonders for my eyeglasses as well. During a day of keyboarding, my fingers start to get a little slick, surely this happens to you as well. A little Neutrogena solution in a small damp cloth works great for the iPad, iPhone, mouse and keyboard. Anything that gets fouled from human hand grease cleans right up. Try it. In fact, the first time you do it, it will be weird, because your mouse will seem suddenly super grippy.
I'm on the horns of a dilemma with regard to what to pack when the primo goes off to college in the fall. On the one hand, I could send him off with an iPad and bluetooth keyboard. On the other hand, a Mac Mini with his current peripherals would be about an equal spend. Hmm. Can't figure that one out yet.
Posted at 12:25 PM in Everyday Geekery | Permalink | Comments (0) | TrackBack (0)
Somewhere in the sands of time that represent my association with Essbase, HP had something to do with a log analysis tool. Or maybe it was Dell. I don't remember. But the bottom line is that they decided on subjecting Essbase database logs to Essbase analysis - like a snake eating its own tail.
The model has User, Action, Server, App/Database, Date, DayOfWeek, Hour of Day.
When you have 300 users of an application and logs going back several years you can make some awesome histograms.
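As a rough illustration of that model, here is a minimal Python sketch that turns log records into facts with those seven dimensions and counts activity along any one of them. The record layout, server names and users are hypothetical; real Essbase logs are laid out differently and would need their own parser.

```python
from collections import Counter
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Hypothetical, already-parsed records:
# (timestamp, server, app/database, user, action).
RECORDS = [
    ("2012-03-30 09:15:02", "srv1", "Finance/Plan", "alice", "Retrieve"),
    ("2012-03-30 09:47:40", "srv1", "Finance/Plan", "bob", "Calculate"),
    ("2012-03-30 14:05:11", "srv1", "Sales/Main", "alice", "Retrieve"),
    ("2012-03-31 09:30:00", "srv2", "Sales/Main", "carol", "Retrieve"),
]


def to_fact(record):
    """Map one log record onto the dimensions of the activity model."""
    timestamp, server, app_db, user, action = record
    when = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
    return {
        "User": user,
        "Action": action,
        "Server": server,
        "App/Database": app_db,
        "Date": when.date().isoformat(),
        "DayOfWeek": DAY_NAMES[when.weekday()],
        "HourOfDay": when.hour,
    }


def histogram(facts, dimension):
    """Count log events along a single dimension."""
    return Counter(fact[dimension] for fact in facts)


facts = [to_fact(record) for record in RECORDS]
by_hour = histogram(facts, "HourOfDay")
by_user = histogram(facts, "User")

if __name__ == "__main__":
    for hour in sorted(by_hour):
        print(f"{hour:02d}:00  {'#' * by_hour[hour]}")
```

With a few years of logs behind it, the same count by hour of day or day of week is what shows you the month end crush before it arrives.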
One of the forgotten arts surrounding Essbase development is the use case approach to reporting. So often, the constraint of development resources and more importantly environmental constraints stop organizations from building specific data marts for specific use cases. When this happens IT teams overcommit to building a single cube for all users. This follows an unfortunate misinterpretation of 'a single version of the truth' when in fact it's more like 'too big to fail'. Enormous amounts of time, energy and frustration go into making sure that one cube doesn't disappoint, and yet it inevitably does. It's got one dimension that 70% of the users don't use. It's got a whole slew of old accounts in the dimension that are no longer active. It's carrying history that nobody actually queries. It's not ready for the crush of activity at month end close. And nobody wants to make any changes in production.
All these problems can be solved by re-engineering the cube, and our team is working to make all of that easy again. Ouroboros is our extended analytic tool that takes Essbase logs, both the database log and the Essbase agent log, and converts them into two Essbase cubes that show user activity. This is nothing particularly earth shattering or novel, but it's part of a more comprehensive package of performance monitoring at the application level that lends itself to making more information available to everyone. This information is actionable because we are drastically reducing the cost of redesign, rearchitecture and, our key attribute, elasticity. Add this to performance monitoring and application migration at the system level and you gain the ability to add arbitrary amounts of capacity on demand. All with objective evidence generated by the platform itself.
Posted at 07:25 AM in Best Practices | Permalink | Comments (2) | TrackBack (0)
There's a lot of text I'm going to get out of this idea. I know it well. One of the things I always see when I get started with a customer is the belief that implementing this system will solve their problems. It means they have fresh memories of the sales presentation. They promptly forget, because they don't know how to read, the embedded and implied caveats of the salesman's speech, and they specify what they want built to me. I listen and nod. As a designer and architect, I can make it happen. Here's what never gets discussed: process. So at the end of the customer's speech they finally give me the $10 question. Can we really accomplish all this? My answer is (now) "If you really want it."
Or let me put it more snarkily. I'm going to show you how to do more, with less, faster and more accurately, and you are going to tell me no, because it makes you uncomfortable.
It always comes down to that. If you want change, you have to change your process. The technology only facilitates change. It doesn't make change. You have to change.
The Guardian Activate Summit: Clay Shirky Keynote from The Guardian and The Paley Center for Media on FORA.tv
Posted at 04:57 AM in Best Practices | Permalink | Comments (0) | TrackBack (0)
What's new you ask?
Since I've been improving my Ruby and AWS I have found an entire universe of widgets, gadgets and more importantly, methodologies and committed communities that work with excellent tools. It feels a little rarified out in this direction, but I am seeing tools that support some very fine processes - more than I expected and all wide open.
Cucumber is the latest of these tools wedded to a serious practice (the last two I jumped on were Things - GTD and Github - git). The new acronym is BDD, meaning Behavior Driven Development. It's an agile methodology that allows you to design code from a testing framework that specifies things in (wait for it) simple English. No, 'simple English' is not another open-source object oriented language, it's that thing they teach in American schools, kinda sorta, knamean?
Try this on for size:
Scenario Outline: Get the list of available Media Packages for a Product Pack and Platform
  Given I am on the Media Pack Search Page
  When I have selected the Product Pack "<Product Pack>" and Platform "<Platform>"
  Then I get a list of Media Packs

  Examples:
    | Product Pack | Platform |
    | Oracle Enterprise Performance Management System | Microsoft Windows x64 |
    | Oracle Enterprise Performance Management System | Linux x86-64 |
This is actually a sample of testing code that we have reverse-engineered from part of the new Full360 product that I call the Software Factory - which is essentially the thing we are building that will allow you to configure and install a fully customized, cloud-based Business Intelligence environment in a matter of minutes. Yeah I said minutes. My personal goal is to see it all done in less time than the single CD installation of Essbase Six - even if it scales to 5000 users, but that's just me. The point is that our specification of software and the test suite is very specific and automated.
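A scenario with an Examples table runs once per table row, with each row's values substituted into the steps. Here is a minimal Python sketch of that expansion. It is only an illustration of the idea, not Cucumber's implementation, and the `<Product Pack>` and `<Platform>` placeholders in the step text are my own assumption about how the steps would reference the table.

```python
def parse_table(lines):
    """Turn pipe-delimited table lines into one dict per data row."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in lines
    ]
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


def expand(steps, examples):
    """Produce one concrete list of steps per example row."""
    expanded = []
    for row in examples:
        concrete = []
        for step in steps:
            for name, value in row.items():
                step = step.replace("<" + name + ">", value)
            concrete.append(step)
        expanded.append(concrete)
    return expanded


TABLE = [
    "| Product Pack | Platform |",
    "| Oracle Enterprise Performance Management System | Microsoft Windows x64 |",
    "| Oracle Enterprise Performance Management System | Linux x86-64 |",
]

# Step text with hypothetical placeholders named after the table columns.
STEPS = [
    "Given I am on the Media Pack Search Page",
    "When I have selected <Product Pack> and <Platform>",
    "Then I get a list of Media Packs",
]

scenarios = expand(STEPS, parse_table(TABLE))

if __name__ == "__main__":
    for number, scenario in enumerate(scenarios, start=1):
        print(f"Scenario {number}:")
        for step in scenario:
            print("  " + step)
```

Two rows in the table means two concrete scenarios, so adding a platform to the test suite is a one-line change to the table.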
What I really love about this is that Cucumber, Gherkin and RSpec are the sorts of languages I have been wanting for years. At the time that I was working with eCRM, I had developed an excellent rapport with the software engineering teams that allowed me to leverage my knowledge of the market, customer base and previous product development to help steer product direction. But I knew that being a 'product manager' meant basically writing a rather twisted form of English, and I think I would have been bored and constrained to a sort of mental asphyxia. What I really feared was spending 18 months specifying a product without enhancing any technical skill myself - not to mention writing follow-up documents for point releases and all that rot. But happy days are here again. Cucumber runs in a terminal window (yay) and writes pseudo-code templates in Ruby. That's what I'm talking about.
Well, I'm only talking about a little bit considering that I only discovered it yesterday and only just read the Testing section of Eloquent Ruby (my fourth Ruby book). It's yet another reason to be happy in my new shoes.
Posted at 06:36 AM in Everyday Geekery | Permalink | Comments (0) | TrackBack (0)
It has turned out to be remarkably simple to add Google Drive to Vault 330. I already had a sense of what to put into Google Docs and what to put into Dropbox. And then I took a moment to think about what Evernote has come to be. These are the three technologies competing, along with my blogs, to answer the question of what I need to store out there vs on my local drives in the Vault. The first three are redundant but easily accessible to me when I am elsewhere. The blogs are always easily accessible elsewhere but have no redundancy that I control. I should remedy that with some scrapers in the future, but for now the question is more about the separation of duties between GDrive and Dropbox.
I think of them essentially as the same service but obviously with a bit more integration on the Dropbox side. With that in mind, I will be moving all of my personal stuff to the GDrive and leaving Dropbox more for business. By business, I mean business documents. Business data will probably remain permanently in the Vault (meaning my own array of external USB drives at home) and semi-permanently in S3 when I need it. Business code is in my private account on Codeplane via git and some at Github and yet more sftp on my own static website hosted by Dreamhost. There will be one exception to the personal vs business and that will be for those of my most personal favorite pictures that I synch with iOS.
The personal documents are all going to GDrive as well as some fraction of my most important photos. Specifically, those photos that I also have on Dropbox will be stored there, somewhere in the iCloud which now seems manifestly transitory, and synched to my iOS devices. The bulk of my photos are still on local disks in the Vault and that's because all of the photo services I use are fairly closed and greedy. It would be nice for my Picasa to read and exchange with my Flickr and both of those to go back and forth with Aperture but it's not all there yet.
Meanwhile, the clear winner for me is Evernote. Anything that I care enough about to make into a PDF goes into Evernote. Simple. Plus I do a manual scrape as I read Flipboard. Altogether it's a very nice ecosystem.
Posted at 11:02 AM in Everyday Geekery | Permalink | Comments (0) | TrackBack (0)
All I have to say is this. Google getting into the consumer survey business is the beginning of the end of marketing as we now know it. This was something I rather expected to see and because of that I started fleshing out an idea a few years ago called WWID.
WWID stands for 'what would i do'. I describe it at my other blog. The point is that marketing survey data should belong to you, not to marketing companies. If you want people to market to you individually, they should pay you for the privilege. If you volunteer data into a marketing database that you fractionally own, then this would be possible. But the problem is that you have been volunteering data (involuntarily by clicking terms of service that you never read) that transfers ownership of that data to marketers. Now Google is in the game and that means the game is almost over.
There is an exception. I'll tell you about it next time I talk about behavioral economics.
Posted at 10:48 AM in Information Theory | Permalink | Comments (0) | TrackBack (0)