excel - VBA script to remove duplicates when the gap values between duplicate pair meets certain criteria - Stack Overflow
- c - Solaris 10 make Error code 1 Fatal Error when trying to build python 2.7.16 - Stack Overflow 推荐度:
- javascript - How to dismiss a phonegap notification programmatically - Stack Overflow 推荐度:
- javascript - Get the JSON objects that are not present in another array - Stack Overflow 推荐度:
- javascript - VS 2015 Angular 2 import modules cannot be resolved - Stack Overflow 推荐度:
- javascript - Type 'undefined' is not assignable to type 'menuItemProps[]' - Stack Overflow 推荐度:
- 相关推荐
I'm still a novice in VBA script. I'd like to automatically remove duplicate(s) from their corresponding pairs with the conditions as follows:
- If the gap value of "Score" variable >= 0.02, remove the remaining duplicates by keeping the highest "Score" value
- If the gap value of "Score" variable <= 0.01, remove the remaining duplicates by keeping the higher value in "Peak" variable.
For example,
ID | Score | Peak |
---|---|---|
CompoundA | 0.85 | 300 |
CompoundA | 0.84 | 310 |
Compound_F | 0.75 | 270 |
Compound_F | 0.71 | 250 |
Compound_F | 0.70 | 244 |
Compound_1z | 0.93 | 370 |
Compound_1z | 0.90 | 353 |
Compound_1z | 0.88 | 349 |
I'm still a novice in VBA script. I'd like to automatically remove duplicate(s) from their corresponding pairs with the conditions as follows:
- If the gap value of "Score" variable >= 0.02, remove the remaining duplicates by keeping the highest "Score" value
- If the gap value of "Score" variable <= 0.01, remove the remaining duplicates by keeping the higher value in "Peak" variable.
For example,
ID | Score | Peak |
---|---|---|
CompoundA | 0.85 | 300 |
CompoundA | 0.84 | 310 |
Compound_F | 0.75 | 270 |
Compound_F | 0.71 | 250 |
Compound_F | 0.70 | 244 |
Compound_1z | 0.93 | 370 |
Compound_1z | 0.90 | 353 |
Compound_1z | 0.88 | 349 |
Expected outcome:
ID | Score | Peak |
---|---|---|
CompoundA | 0.84 | 310 |
Compound_F | 0.75 | 270 |
Compound_1z | 0.93 | 370 |
I tried this:
Sub RemoveDuplicatesByScoreAndPeak()
Dim ws As Worksheet
Dim lastRow As Long
Dim i As Long
Dim gap As Double
' Define the worksheet
Set ws = ActiveSheet
' Find the last used row
lastRow = ws.Cells(ws.Rows.Count, 1).End(xlUp).Row
' Sort the data by ID (ascending) and Score (descending)
ws.Sort.SortFields.Clear
ws.Sort.SortFields.Add Key:=ws.Columns(1), Order:=xlAscending
ws.Sort.SortFields.Add Key:=ws.Columns(2), Order:=xlDescending
With ws.Sort
.SetRange ws.Range("A2:C" & lastRow)
.Header = xlYes
.MatchCase = False
.Orientation = xlTopToBottom
.SortMethod = xlPinYin
.Apply
End With
' Loop through rows to remove duplicates
For i = lastRow To 2 Step -1
If ws.Cells(i, 1).Value = ws.Cells(i - 1, 1).Value Then
' Calculate the gap in "Score"
gap = Abs(ws.Cells(i, 2).Value - ws.Cells(i - 1, 2).Value)
If gap >= 0.02 Then
' Keep the row with the highest "Score"
If ws.Cells(i, 2).Value < ws.Cells(i - 1, 2).Value Then
ws.Rows(i).Delete
Else
ws.Rows(i - 1).Delete
End If
ElseIf gap <= 0.01 Then
' Keep the row with the higher "Peak" value
If ws.Cells(i, 3).Value < ws.Cells(i - 1, 3).Value Then
ws.Rows(i).Delete
Else
ws.Rows(i - 1).Delete
End If
End If
End If
Next i
' Notify the user
MsgBox "Duplicates processed successfully!", vbInformation
End Sub
I just want to use the script without loop cause I keep receiving error pop-up message. Thanks in advance.
Share Improve this question edited yesterday Tim Williams 166k8 gold badges100 silver badges137 bronze badges asked yesterday FransFrans 11 New contributor Frans is a new contributor to this site. Take care in asking for clarification, commenting, and answering. Check out our Code of Conduct. 4 |2 Answers
Reset to default 0I feel like you have to loop and you know how to build a loop. Your error probably comes from deleting the row above where you are at when it moves up to the next iteration. Excel really only likes deleting the row you are on or previous ones you have visited. And, of course, as you designed it, when deleting rows, you should process from the bottom to the top.
Instead of deleting the row above when conditions are met, don't. Your "If" should only delete the row you are on or the one you came from. In other words, check down, not up.
This seems to work (at least, it matches your sample output)
Sub RemoveDuplicatesByScoreAndPeak()
Dim ws As Worksheet, rngData As Range, data, rwA As Range, rwB As Range
Dim lastRow As Long, rngDel As Range, n As Long, rStart As Long, rComp As Long
Dim compCol As Long, indx As Long, i As Long
Dim gap As Double, groups As Object, k, rng As Range, lr As Long
Set ws = ActiveSheet ' Define the worksheet
lastRow = ws.Cells(ws.Rows.Count, 1).End(xlUp).Row ' Find the last used row
Set rngData = ws.Range("A2:C" & lastRow)
rngData.Font.Color = vbBlack 'when testing using font color to indicate deletes
' Sort the data by ID (ascending) and Score (ascending)
With ws.Sort
.SortFields.Clear
.SortFields.Add Key:=rngData.Columns(1), Order:=xlAscending
.SortFields.Add Key:=rngData.Columns(2), Order:=xlAscending
.SetRange rngData
.Header = xlNo
.MatchCase = False
.Orientation = xlTopToBottom
.SortMethod = xlPinYin
.Apply
End With
Set groups = RowGroups(rngData) 'rows grouped by id
For Each k In groups 'loop over compound id's
Set rng = groups(k)
Debug.Print "Id: " & k, rng.Address
n = rng.Rows.Count
If n > 1 Then 'id has replicates ?
lr = rng.Rows(n).Row 'last row for this id...
Set rwA = rng.Rows(1) 'first pair of rows to compare
Set rwB = rng.Rows(2)
Do
gap = Abs(rwA.Cells(2).Value - rwB.Cells(2).Value)
compCol = IIf(gap >= 0.02, 2, 3) 'set comparison column
If rwA.Cells(compCol).Value < rwB.Cells(compCol).Value Then
BuildRange rngDel, rwA
Set rwA = rwB 'reset the comparison rows...
Set rwB = rwB.Offset(1)
Else
BuildRange rngDel, rwB
Set rwB = rwB.Offset(1) 'next row down
End If
If rwB.Row > lr Then Exit Do 'ran out of rows
Loop
End If
Next k
If Not rngDel Is Nothing Then rngDel.Font.Color = vbRed 'or rngDel.Entirerow.delete
' Notify the user
MsgBox "Duplicates processed successfully!", vbInformation
End Sub
'return a dictionary keyed on column 1 of rngData, with corresponding ranges as values
' assumes data is sorted on at least column 1
Function RowGroups(rngData As Range) As Object 'dictionary of ranges
Dim dict As Object, id, rw As Range
Set dict = CreateObject("scripting.dictionary")
For Each rw In rngData.Rows
id = rw.Cells(1).Value
If Not dict.exists(id) Then
Set dict(id) = rw
Else
Set dict(id) = Application.Union(dict(id), rw)
End If
Next rw
Set RowGroups = dict
End Function
Sub BuildRange(ByRef rngTot As Range, rngAdd As Range)
If rngTot Is Nothing Then
Set rngTot = rngAdd
Else
Set rngTot = Application.Union(rngTot, rngAdd)
End If
End Sub
- IBM联合创新策略伙伴合作启动
- 三问凡客“毁三观”
- reactjs - CORS issue from React to Flask - Stack Overflow
- spring boot - This JPQL translation to SQL doesn't make any sense - Stack Overflow
- node.js - nodejspostgres connect issue - Stack Overflow
- python - How to fix autocomplete menu flickering in PyCharm on Linux? - Stack Overflow
- amazon web services - How to create a CloudWatch alarm for an EventBridge Pipe's stopped state in AWS? - Stack Overflow
- kotlin - Do I need Serialization to Transfer a MutableMap Between Minecraft Server and Client? If so, How Should I Serialize It?
- vue.js - single pages doesn't work when i moved them to app component - Stack Overflow
- Windows Powershell: check if a symlink is "broken" - Stack Overflow
- wordpress - Give access to users own submitted entries only wpforms - Stack Overflow
- reflection - java.lang.reflect.field.set(obj,value) fails for applicationscoped - Stack Overflow
- docker - Unable to Set Up Keycloak with Production Mode - Stack Overflow
- spring boot - Java - Implement @Component class dynmically - Stack Overflow
- caching - How is the userUX affected by cache purge by a WordPress plugin? Cookie consent settings is not affected? - Stack Over
- c# - Unity build error for Android when use CMake with Ninja on Mac - Stack Overflow
- javascript - How to Upload an Image to Supabase Storage and Store the Public URL in a Form Using Zod and React Hook Form in Next
Compound_F
, the gap between the 1st and 2nd rows meets Condition 1, and the gap between the 2nd and 3rd rows meets Condition 2. Why is the first row still included in the output table? – taller Commented yesterday