**Tom’s Tutorials For Excel: Fuzzy Matching With Partial Strings For Sum or Average**

When you have a mish-mash of alphanumeric strings in a list, here’s how you can sum or average their corresponding numbers based on a partial string criterion.

The picture shows two examples with the criterion in cell B4.

The example on the left evaluates each item in column B for the presence of “354” and sums for column C. The cells that match are B13, B16, and B19, so their corresponding numbers in cells C13, C16, and C19 sum to 168 in blue-colored cell C4, with array formula

`=SUM(IF(ISNUMBER(FIND(B4,B7:B19,1)),C7:C19))`

.

The example on the right evaluates each item in column B for the presence of case insensitive “ATLAS” and averages for column C. The cells that match are B7, B8, B14, B15, and B19, with numbers in C7, C8, C14, C15, C19 averaged in pink-colored cell C4, by array formula

`=AVERAGE(IF(ISNUMBER(FIND(B4,B7:B19,1)),C7:C19))`

.

Recall, array formulas are applied to a cell by simultaneously pressing the Ctrl+Shift+Enter keys, not just Enter. The curly braces are not typed in by you; Excel places them automatically when the array formula is properly installed.

Hi Tom,

For fun, I tried to modify the formula so that the partial string value of 354 can be separated with other text and/or numbers. It seems to be possible but it’s a long formula! I’ll test it and share tomorrow.

Cheers,

Kevin Lehrbass

https://www.youtube.com/user/MySpreadsheetLab

Hi guys,

Is the challenge still open? Didn’t sound as hard to me – solved for SUM & AVERAGE by using 4 & 5 functions respectively: either below 70 chars, and both array entered.

Regards,

Zoran

—

Way to contact me: https://goo.gl/O2cSY8

Dear Sir,

Why did you not use wildcard here ?

Regards,

Vikas.

What wildcard would help? You need to evaluate for the presence of a specific string (in a larger string) such as in the criteria cell B4. If I am looking for “XYZ” and in the string, there exists KJSHXYZLRGY, and I used the “?” wildcard to find ?XYZ? then my returned result would be HXYZL. That can be bypassed with just the specific presence of XYZ using the FIND method for an exact match of that (in this example) 3-character string inside a larger string.

If I’m not mistaken, the asterick wild card would work (as long as cells are text):

SUMIF(B7:B19,”*354*”,C7:C19) OR “*ATLAS*”

AVERAGEIF can be used for averages.

Hi Dan, thanks. Also I see I missed a previous, similar comment from Vikas, and yes, you are right.

I can’t remember exactly, so I need to go back to my notes from when I wrote this post, but something back then struck me about running into unreliable results with wildcards as opposed to FIND. I’ll go back and see, with an example to show (if it turns out to be the case), a scenario I seem to recall coming across that fails with wildcards. In the meantime I wanted to reply and thank you (and Vikas belatedly) for your comments, much appreciated.